1. Introduction

Let $X_1, \dots, X_n$ be a sample of unobservable random variables from an unknown distribution function $F_0$ on the interval $[0, 1]$. More generally, we could take an arbitrary closed interval $[a, b]$ as support for the underlying distribution, but for the purposes of the development of the theory, we can just as well take $[0, 1]$, as is also done in [1].

Suppose that one can observe $n$ pairs $(T_i, U_i)$, independent of $X_i$, with a joint density function $h$ on the upper triangle of the unit square, for which the sum of the marginal densities is bounded away from zero. Moreover,

\[ \Delta_{i1} = 1_{\{X_i \le T_i\}}, \quad \Delta_{i2} = 1_{\{T_i < X_i \le U_i\}}, \quad \Delta_{i3} = 1 - \Delta_{i1} - \Delta_{i2}, \tag{1.1} \]

provide the only information one has on the position of the random variables $X_i$ with respect to the observation times $T_i$ and $U_i$. In this set-up we want to estimate the unknown distribution function $F_0$, generating the "unobservables" $X_i$. This setting is known as interval censoring, case 2.
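As an illustration of this observation scheme, the following minimal sketch simulates case 2 data and computes the indicators of (1.1). The function name `simulate_case2`, the argument `quantile_f0` (an inverse distribution function supplied by the user) and the choice of observation times uniform on the upper triangle are assumptions made for this example only.

```python
import random

def simulate_case2(n, quantile_f0, seed=0):
    """Simulate n case 2 observations (T_i, U_i, Delta_i1, Delta_i2, Delta_i3).

    quantile_f0 is a hypothetical inverse of F_0 on [0, 1]; (T_i, U_i) is drawn
    uniformly on the upper triangle as the (min, max) of two uniform variables.
    """
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        x = quantile_f0(rng.random())          # unobservable X_i
        a, b = rng.random(), rng.random()
        t, u = min(a, b), max(a, b)            # observation times T_i <= U_i
        d1 = int(x <= t)                       # Delta_i1 = 1{X_i <= T_i}
        d2 = int(t < x <= u)                   # Delta_i2 = 1{T_i < X_i <= U_i}
        d3 = 1 - d1 - d2                       # Delta_i3
        sample.append((t, u, d1, d2, d3))
    return sample

# Example: F_0(x) = 1 - (1 - x)^4, whose inverse is 1 - (1 - p)^(1/4).
data = simulate_case2(1000, lambda p: 1.0 - (1.0 - p) ** 0.25)
```

Only the tuples in `data` would be available to the statistician; the values of $X_i$ are discarded inside the loop.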


imsart-ejs ver. 2011/12/01 file: interval.tex date: December 13, 2011 



P. Groeneboom and T. Ketelaars / Interval censoring






The model of current status data, also known as interval censoring, case 1, has been thoroughly studied, and has a theory which is considerably simpler than the theory for the interval censoring, case 2, model. In the current status model one only has one observation time $T_i$, corresponding to the unobservable $X_i$, and the only information we have about $X_i$ is whether $X_i$ is to the left or to the right of $T_i$.

Although the present paper mainly focuses on the case 2 model, we start by discussing the current status model, in order to put this paper into a more general context and to explain why the case 2 model is so much harder to study. In the current status model, the only observations which are available to us are the pairs

\[ (T_i, \Delta_i), \qquad \Delta_i = 1_{\{X_i \le T_i\}}, \]

so we do not observe $X_i$ itself, but only its "current status" $\Delta_i$. The nonparametric maximum likelihood estimator, commonly denoted by NPMLE or just MLE, maximizes the (partial) log likelihood

\[ \sum_{i=1}^{n} \{\Delta_i \log F(T_i) + (1 - \Delta_i) \log (1 - F(T_i))\}, \]

where the maximization is over all distribution functions $F$.

The MLE can be found in one step by computing the left-continuous slope of the greatest convex minorant of the cusum diagram of the points $(0, 0)$ and the points

\[ \Bigl(i, \sum_{j \le i} \Delta_{(j)}\Bigr), \quad i = 1, \dots, n, \tag{1.2} \]

using a notation introduced in [10]. Here $\Delta_{(j)}$ denotes the indicator corresponding to the $j$th order statistic $T_{(j)}$. The theory for this estimator is further developed in [10], where also the (non-normal) pointwise limit distribution is derived and it is shown that the rate of convergence is $n^{-1/3}$.
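The one-step computation can be sketched as follows. The slopes of the greatest convex minorant of a cusum diagram coincide with the isotonic regression of the ordered indicators, so the sketch uses the pool adjacent violators algorithm instead of constructing the minorant explicitly. The function name `current_status_mle` is ours, and distinct observation times are assumed.

```python
def current_status_mle(times, deltas):
    """Values of the current status MLE at the ordered observation times.

    Pool adjacent violators on the indicators, ordered by observation time;
    the block means are the slopes of the greatest convex minorant of the
    cusum diagram (1.2).
    """
    order = sorted(range(len(times)), key=lambda i: times[i])
    blocks = []                                # each block: [sum of indicators, size]
    for i in order:
        blocks.append([float(deltas[i]), 1])
        # merge the last two blocks while their means are not increasing
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] >= blocks[-1][0] / blocks[-1][1]):
            s, m = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += m
    fitted = []
    for s, m in blocks:
        fitted.extend([s / m] * m)
    return [times[i] for i in order], fitted

sorted_times, fitted = current_status_mle([0.9, 0.1, 0.4, 0.6], [1, 0, 1, 0])
```

In this small example the indicators, ordered by time, are 0, 1, 0, 1; the two middle values violate monotonicity and are pooled to their mean.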

In contrast, there is no such one-step algorithm for computing the MLE in the case 2 situation, where one wants to maximize

\[ \sum_{i=1}^{n} \{\Delta_{i1} \log F(T_i) + \Delta_{i2} \log \{F(U_i) - F(T_i)\} + \Delta_{i3} \log (1 - F(U_i))\} \]

over distribution functions $F$. One has to take recourse to iterative algorithms, for example the iterative convex minorant algorithm, introduced in [10] and further developed in [11]. Moreover, the MLE can possibly achieve a faster local rate of convergence than in the current status model, depending on properties of the bivariate distribution of the observation times $(T_i, U_i)$.
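To make the need for iteration concrete, here is a minimal sketch of one such iterative scheme. It is not the iterative convex minorant algorithm of [10] and [11], but the simpler self-consistency (EM) iteration, which increases the same log likelihood at every step. The function names and the choice of candidate support points (all observation times, plus a point at infinity for the mass to the right of the largest observation time) are assumptions of this example.

```python
import bisect
import math

def index_range(pts, obs_i):
    """Indices of the support points compatible with one observation."""
    t, u, d1, d2, d3 = obs_i
    it = bisect.bisect_right(pts, t)           # points <= t have index < it
    iu = bisect.bisect_right(pts, u)           # points <= u have index < iu
    return (0, it) if d1 else (it, iu) if d2 else (iu, len(pts))

def em_case2(obs, n_iter=500):
    """Self-consistency iterations; obs holds tuples (t, u, d1, d2, d3)."""
    pts = sorted({o[0] for o in obs} | {o[1] for o in obs}) + [math.inf]
    ranges = [index_range(pts, o) for o in obs]
    p = [1.0 / len(pts)] * len(pts)            # start from uniform masses
    for _ in range(n_iter):
        new = [0.0] * len(pts)
        for lo, hi in ranges:
            total = sum(p[lo:hi])
            for j in range(lo, hi):
                new[j] += p[j] / total         # conditional mass given the observation
        p = [v / len(obs) for v in new]
    return pts, p

def log_likelihood(obs, pts, p):
    return sum(math.log(sum(p[lo:hi]))
               for lo, hi in (index_range(pts, o) for o in obs))

obs = [(0.2, 0.5, 1, 0, 0), (0.1, 0.6, 0, 1, 0),
       (0.3, 0.9, 0, 0, 1), (0.4, 0.7, 0, 1, 0)]
pts, p = em_case2(obs)
```

The estimate of $F$ at a point $x$ is the sum of the masses `p[j]` with `pts[j] <= x`. The iteration is known to converge slowly, which is one reason for preferring the algorithms cited above in practice.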

In the so-called non-separated case, the density of the pair of observation times $(T_i, U_i)$ is positive on the diagonal, meaning that we can have arbitrarily small observation intervals $[T_i, U_i]$. For this situation, [1] proposes a simple piecewise constant estimator for $F_0$, with the purpose of showing that in this situation an estimator can be constructed that achieves the $(n \log n)^{-1/3}$ convergence rate, which is optimal in a minimax sense, both using a global loss function and using a local loss function for the estimation at a fixed point. In the separated case, the observation times $T_i$ and $U_i$ cannot become arbitrarily close: in this case there exists an $\epsilon > 0$ so that $U_i - T_i \ge \epsilon$ for each $i$. In this case the convergence rate of Birgé's estimator is $n^{-1/3}$ again, which is also the minimax rate for the current status model. For both situations we derive the asymptotic behavior of Birgé's estimator, and compare this with the behavior of the MLE in a simulation study. The simulations show a better behavior of the MLE, probably caused by the local adaptivity of the MLE.

A common complaint about the MLEs is that under the conditions for which the local asymptotic distribution result is derived, other estimators can be suggested, which in fact attain a faster rate of convergence. Such estimators are discussed for the current status model in, e.g., [8], [9] and [7]. We introduce a similar estimator below for the case 2 model, the smoothed maximum likelihood estimator (SMLE). The smoothed MLE is defined by

\[ \tilde F_n^{(SML)}(t) = \int \mathbb{K}((t - u)/b_n) \, d\hat F_n(u), \tag{1.3} \]

where $\hat F_n$ is the MLE and

\[ \mathbb{K}(w) = \begin{cases} 0, & w < -1, \\ \int_{-1}^{w} K(x) \, dx, & w \in [-1, 1], \\ 1, & w > 1, \end{cases} \]

letting $K$ be a smooth symmetric kernel, with support $[-1, 1]$, like the triweight kernel

\[ K(w) = \tfrac{35}{32} (1 - w^2)^3 \, 1_{[-1,1]}(w), \]

and taking the bandwidth $b_n \asymp n^{-1/5}$. Note that

\[ \tilde f_n^{(SML)}(t) = \frac{1}{b_n} \int K((t - u)/b_n) \, d\hat F_n(u) \]

is an estimate of the density $f_0$ of the underlying distribution function $F_0$.
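Since the MLE is a step function, the integral in the definition of the smoothed MLE reduces to a finite sum over the jumps of the MLE. The sketch below evaluates this sum for the triweight kernel, whose integrated version is a polynomial on $[-1, 1]$. The names `integrated_triweight` and `smle`, and the small set of jump locations and jump sizes, are hypothetical and only serve as an example.

```python
def integrated_triweight(w):
    """Integral of the triweight kernel (35/32)(1 - x^2)^3 from -1 to w."""
    if w <= -1.0:
        return 0.0
    if w >= 1.0:
        return 1.0
    # antiderivative of (1 - x^2)^3 = 1 - 3x^2 + 3x^4 - x^6, vanishing at 0
    prim = w - w ** 3 + 0.6 * w ** 5 - w ** 7 / 7.0
    return 0.5 + (35.0 / 32.0) * prim

def smle(t, support, masses, b):
    """Smoothed MLE at t, given jump locations and jump sizes of the MLE."""
    return sum(m * integrated_triweight((t - s) / b)
               for s, m in zip(support, masses))

# Hypothetical jumps of an MLE, for illustration only.
support = [0.2, 0.5, 0.8]
masses = [0.3, 0.4, 0.3]
values = [smle(x / 10.0, support, masses, 0.25) for x in range(0, 12)]
```

Because the integrated kernel is nondecreasing and the jumps are positive, the resulting function is monotone with values in $[0, 1]$.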

Analogously to what has been proved for the current status model, we expect the smoothed MLE to converge at (at least) rate $n^{-2/5}$ under appropriate regularity conditions. It is an attractive alternative to the MLE and the histogram-type estimator of [1]. We give a heuristic discussion on this in section 6. Just as in [3] and [4], the asymptotic variance depends on the solution of an integral equation. The asymptotic expressions for the variance, obtained by solving these equations numerically, give a rather good fit with the actually observed variances, as shown in section 6. The SMLE can probably also be used for a two-sample test for interval censored data, analogous to the two-sample test for current status data, introduced in [7]. The MSE of the smoothed MLE is much smaller than that of Birgé's estimator or the MLE for smooth underlying distribution functions, as is illustrated in the sections on the simulations.
A picture of the three estimators is shown in Figure 1. The MLE and smoothed MLE are monotone, in contrast with Birgé's estimator. Also Birgé's estimator can have negative values and values larger than 1; both events happen in the picture shown. This cannot happen for the MLE and smoothed MLE, since these are based on isotonization; the smoothed MLE is an integral of a positive kernel w.r.t. the (positive) jumps of the MLE, and inherits the monotonicity properties of the MLE. Although histogram-type estimators (like Birgé's estimator) and kernel estimators without any isotonization are much easier to analyze than the estimators based on isotonization, the price one has to pay is the behavior illustrated in Figure 1.

Fig 1. Birgé's estimator (dashed), the MLE (dotted), and the smoothed MLE (dashed-dotted) for sample size $n = 1000$ and $b_n = n^{-1/5}$, when $F_0(x) = 1 - (1 - x)^4$ (solid curve) and the observation distribution is uniform on the upper triangle of the unit square.



2. A local minimax result for the non-separated case 

In this section we derive a local minimax result for the non-separated case of 
the interval censoring problem, case 2. This result will provide the best possible 
local convergence rate and also the best constant, as far as this constant depends 
on the underlying distributions. 
Our approach makes use of a perturbation of $F_0$ which is defined by

\[ F_n(x) = \begin{cases} F_0(x), & x < t_0 - c(n \log n)^{-1/3}, \\ F_0(t_0 - c(n \log n)^{-1/3}), & x \in [t_0 - c(n \log n)^{-1/3}, t_0), \\ F_0(t_0 + c(n \log n)^{-1/3}), & x \in [t_0, t_0 + c(n \log n)^{-1/3}), \\ F_0(x), & x \ge t_0 + c(n \log n)^{-1/3}, \end{cases} \]

for a $c > 0$ to be specified below.
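The perturbation replaces $F_0$ by a two-step function on a shrinking neighborhood of $t_0$ and leaves it unchanged elsewhere. A minimal sketch, with the hypothetical function name `perturbed_cdf` and the uniform distribution function used as an example for $F_0$:

```python
import math

def perturbed_cdf(f0, t0, c, n):
    """Return the perturbation F_n of F_0 and the half-width delta_n."""
    delta = c * (n * math.log(n)) ** (-1.0 / 3.0)

    def fn(x):
        if x < t0 - delta or x >= t0 + delta:
            return f0(x)                       # unchanged outside the window
        if x < t0:
            return f0(t0 - delta)              # constant on [t0 - delta, t0)
        return f0(t0 + delta)                  # constant on [t0, t0 + delta)

    return fn, delta

fn, delta = perturbed_cdf(lambda x: x, 0.5, 1.0, 1000)
```

The function `fn` is again a distribution function: it is constant on the two half-windows, jumps upward at $t_0$, and joins $F_0$ continuously at $t_0 + \delta_n$ if $F_0$ is continuous there.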

Before stating the theorem to be proved, we introduce some notation. Let $\Delta = (\Delta_1, \Delta_2) \in \Gamma := \{(1,0), (0,1), (0,0)\}$ and define the densities $q_0$ and $q_n$ by

\[ q_0(t,u,\delta) = h(t,u) F_0(t)^{\delta_1} \{F_0(u) - F_0(t)\}^{\delta_2} \{1 - F_0(u)\}^{1-\delta_1-\delta_2}, \]
\[ q_n(t,u,\delta) = h(t,u) F_n(t)^{\delta_1} \{F_n(u) - F_n(t)\}^{\delta_2} \{1 - F_n(u)\}^{1-\delta_1-\delta_2}, \]

with respect to the measure $\lambda = \lambda_1 \otimes \lambda_2$ on $\Omega = \mathbb{R}^2_+ \times \Gamma$, where $\lambda_1$ is the Lebesgue measure and $\lambda_2$ is counting measure. We note that $q_0$ is the joint density of $(T, U, \Delta_1, \Delta_2)$.

Furthermore, let $(L_n)$, $n \ge 1$, be a sequence of estimators for $F_0(t_0)$, based on samples of size $n$, generated by $q_0$. That is, we can write

\[ L_n = l_n((T_1, U_1, \Delta_{1,1}, \Delta_{1,2}), \dots, (T_n, U_n, \Delta_{n,1}, \Delta_{n,2})), \]

where $l_n$ is a Borel measurable function. Then, the following theorem holds:
Theorem 2.1.

\[ \liminf_{n \to \infty} (n \log n)^{1/3} \max\{E_{n,q_0}|L_n - F_0(t_0)|, E_{n,q_n}|L_n - F_n(t_0)|\} \ge \tfrac14 \exp(-1/3) \{3 f_0(t_0)^2 / h(t_0, t_0)\}^{1/3}, \]

where $E_{n,q}$ denotes the expectation with respect to the product measure $q^{\otimes n}$.

In our proof we need the following lemma, which is proved in [6]. This type of result is often denoted as "LeCam's lemma".

Lemma 2.1. Let $G$ be a set of probability densities on a measurable space $(\Omega, \mathcal{A})$ with respect to a $\sigma$-finite dominating measure $\mu$, and let $L$ be a real-valued functional on $G$. Moreover, let $\ell : [0, \infty) \to \mathbb{R}$ be an increasing convex loss function, with $\ell(0) = 0$. Then, for any $q_1, q_2 \in G$ such that the Hellinger distance $H(q_1, q_2) < 1$:

\[ \inf_{L_n} \max\{E_{n,q_1} \ell(|L_n - Lq_1|), E_{n,q_2} \ell(|L_n - Lq_2|)\} \ge \ell\bigl(\tfrac14 |Lq_1 - Lq_2| \{1 - H^2(q_1, q_2)\}^{2n}\bigr). \]



Proof of theorem 2.1. Let the partitioning $A_{1,n} \cup \dots \cup A_{6,n}$ of $\{(t,u) \in \mathbb{R}^2_+ : t < u\}$ be defined by

\begin{align*}
A_{1,n} &= \{(t,u) \in \mathbb{R}^2_+ : 0 \le t < t_0 - \delta_n, \ t_0 - \delta_n \le u < t_0\}, \\
A_{2,n} &= \{(t,u) \in \mathbb{R}^2_+ : 0 \le t < t_0 - \delta_n, \ t_0 \le u < t_0 + \delta_n\}, \\
A_{3,n} &= \{(t,u) \in \mathbb{R}^2_+ : t_0 - \delta_n \le t < t_0, \ t_0 + \delta_n \le u < \infty\}, \\
A_{4,n} &= \{(t,u) \in \mathbb{R}^2_+ : t_0 \le t < t_0 + \delta_n, \ t_0 + \delta_n \le u < \infty\}, \\
A_{5,n} &= \{(t,u) \in \mathbb{R}^2_+ : t_0 - \delta_n \le t < t_0 + \delta_n, \ t < u < t_0 + \delta_n\}, \\
A_{6,n} &= \{(t,u) \in \mathbb{R}^2_+ : t < u\} \setminus \{A_{1,n} \cup \dots \cup A_{5,n}\},
\end{align*}

where $\delta_n = c(n \log n)^{-1/3}$. The partitioning is shown in figure 2.





Fig 2. The areas $A_{1,n}, \dots, A_{6,n}$.



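Then the squared Hellinger distance between $q_0$ and $q_n$ can be written as

\[ H^2(q_n, q_0) = \frac12 \sum_{k=1}^{6} \int_{A_{k,n}} h(t,u) \Bigl\{ \bigl(\sqrt{F_n(t)} - \sqrt{F_0(t)}\bigr)^2 + \bigl(\sqrt{F_n(u) - F_n(t)} - \sqrt{F_0(u) - F_0(t)}\bigr)^2 + \bigl(\sqrt{1 - F_n(u)} - \sqrt{1 - F_0(u)}\bigr)^2 \Bigr\} \, dt \, du. \]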

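We now calculate the three integrals over $A_{1,n}$.

\[ \int_{A_{1,n}} h(t,u) \bigl(\sqrt{F_n(t)} - \sqrt{F_0(t)}\bigr)^2 \, dt \, du = 0. \tag{2.1} \]

Furthermore,

\[ \int_{A_{1,n}} h(t,u) \bigl(\sqrt{F_n(u) - F_n(t)} - \sqrt{F_0(u) - F_0(t)}\bigr)^2 \, dt \, du \sim \int_0^{t_0 - \delta_n} \frac{h(t, t_0) f_0(t_0)^2 \delta_n^3}{12 (F_0(t_0) - F_0(t))} \, dt. \]

The last integral can be split into two integrals over the sets $[0, t_0 - \kappa_n)$ and $[t_0 - \kappa_n, t_0 - \delta_n]$, where $\kappa_n = (\log n)^{-1/3}$. Since

\[ \int_0^{t_0 - \kappa_n} \frac{h(t, t_0) f_0(t_0)^2 \delta_n^3}{12 (F_0(t_0) - F_0(t))} \, dt = O(\delta_n^3 \kappa_n^{-1}), \]

and

\begin{align*}
\int_{t_0 - \kappa_n}^{t_0 - \delta_n} \frac{h(t, t_0) f_0(t_0)^2 \delta_n^3}{12 (F_0(t_0) - F_0(t))} \, dt
&= \bigl(f_0(t_0) (\delta_n^3 + o(\delta_n^3)) / 12\bigr) (h(t_0, t_0) + o(1)) \int_{t_0 - \kappa_n}^{t_0 - \delta_n} \frac{f_0(t)}{F_0(t_0) - F_0(t)} \, dt \\
&= \bigl(f_0(t_0) h(t_0, t_0) (\delta_n^3 + o(\delta_n^3)) / 12\bigr) \bigl[-\log(F_0(t_0) - F_0(t))\bigr]_{t_0 - \kappa_n}^{t_0 - \delta_n} \\
&= f_0(t_0) h(t_0, t_0) c^3 n^{-1} / 36 + o(n^{-1}),
\end{align*}

it follows that

\[ \int_{A_{1,n}} h(t,u) \bigl(\sqrt{F_n(u) - F_n(t)} - \sqrt{F_0(u) - F_0(t)}\bigr)^2 \, dt \, du = f_0(t_0) h(t_0, t_0) c^3 n^{-1} / 36 + o(n^{-1}). \tag{2.2} \]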
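Next, a straightforward computation shows that

\[ \int_{A_{1,n}} h(t,u) \bigl(\sqrt{1 - F_n(u)} - \sqrt{1 - F_0(u)}\bigr)^2 \, dt \, du \sim \int_{A_{1,n}} h(t,u) \frac{(u - t_0 + \delta_n)^2 f_0(t_0)^2}{4 (1 - F_0(t_0))} \, dt \, du = O(\delta_n^3). \tag{2.3} \]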
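Using (2.1), (2.2) and (2.3), we get

\[ \int_{A_{1,n}} h(t,u) \Bigl\{ \bigl(\sqrt{F_n(t)} - \sqrt{F_0(t)}\bigr)^2 + \bigl(\sqrt{F_n(u) - F_n(t)} - \sqrt{F_0(u) - F_0(t)}\bigr)^2 + \bigl(\sqrt{1 - F_n(u)} - \sqrt{1 - F_0(u)}\bigr)^2 \Bigr\} \, dt \, du = f_0(t_0) h(t_0, t_0) c^3 n^{-1} / 36 + O(\delta_n^3 \kappa_n^{-1}). \]

The integrals over $A_{2,n}$, $A_{3,n}$ and $A_{4,n}$ can be treated in a similar way. Indeed,

\[ \int_{A_{k,n}} h(t,u) \Bigl\{ \bigl(\sqrt{F_n(t)} - \sqrt{F_0(t)}\bigr)^2 + \bigl(\sqrt{F_n(u) - F_n(t)} - \sqrt{F_0(u) - F_0(t)}\bigr)^2 + \bigl(\sqrt{1 - F_n(u)} - \sqrt{1 - F_0(u)}\bigr)^2 \Bigr\} \, dt \, du = f_0(t_0) h(t_0, t_0) c^3 n^{-1} / 36 + O(\delta_n^3 \kappa_n^{-1}), \quad k = 2, 3, 4. \]

Moreover, it is easily verified that

\[ \int_{A_{5,n}} h(t,u) \Bigl\{ \bigl(\sqrt{F_n(t)} - \sqrt{F_0(t)}\bigr)^2 + \bigl(\sqrt{F_n(u) - F_n(t)} - \sqrt{F_0(u) - F_0(t)}\bigr)^2 + \bigl(\sqrt{1 - F_n(u)} - \sqrt{1 - F_0(u)}\bigr)^2 \Bigr\} \, dt \, du = O(\delta_n^3). \]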



Thus, we infer that the asymptotic squared Hellinger distance between $q_0$ and $q_n$ is given by

\[ H^2(q_0, q_n) \sim f_0(t_0) h(t_0, t_0) c^3 n^{-1} / 18. \]

By using lemma 2.1 we now get:

\begin{align*}
(n \log n)^{1/3} \inf \max\{E_{n,q_0}|L_n - F_0(t_0)|, E_{n,q_n}|L_n - F_n(t_0)|\}
&\ge \tfrac14 (n \log n)^{1/3} |F_n(t_0) - F_0(t_0)| \{1 - H^2(q_n, q_0)\}^{2n} \\
&\to \tfrac14 c f_0(t_0) \exp\bigl\{-\tfrac19 h(t_0, t_0) f_0(t_0) c^3\bigr\}.
\end{align*}

Maximizing the last expression over $c$ yields the desired minimax lower bound.

□ 
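The final maximization step can be written out in general form. In the sketch below, $A$ and $B$ are placeholders (our notation) for the positive constants multiplying $c$ and $c^3$ in the last display of the proof; the computation shows where the factor $\exp(-1/3)$ in Theorem 2.1 comes from.

```latex
\[
  \varphi(c) = A\,c\,\exp(-B c^3), \qquad
  \varphi'(c) = A \exp(-B c^3)\,(1 - 3 B c^3) = 0
  \iff c^3 = \frac{1}{3B},
\]
\[
  \max_{c > 0} \varphi(c) = A\,(3B)^{-1/3} \exp(-1/3).
\]
```

Since $\varphi$ increases for $c^3 < 1/(3B)$ and decreases afterwards, the stationary point is indeed the maximum.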
3. Asymptotic distribution of Birgé's estimator in the
non-separated case

[1] constructed a histogram-type estimator to show that the minimax lower bound rate of the preceding section can indeed be attained in the non-separated case. It is defined in the following way. Let $t_0$ be an interior point of $[0,1]$, let $c$ be a positive constant and let $K = \lfloor c^{-1} (n \log n)^{1/3} \rfloor$, where $n$ is the sample size and where $\lfloor x \rfloor$ denotes the "floor" of $x$, i.e., the largest integer which is smaller than or equal to $x$. We distinguish two cases.

(i) If $K t_0 \in \mathbb{N}$, the interval $[0,1]$ is partitioned into $K$ intervals $I_j$, $j = 1, \dots, K$, of equal length $1/K$, where $I_j = [t_j, t_{j+1})$, $1 \le j < K$, $I_K = [t_K, t_{K+1}]$, and $t_1 = 0$, $t_{K+1} = 1$.

(ii) If $K t_0 \notin \mathbb{N}$, the interval $[0, 1]$ is partitioned into $K + 1$ intervals $I_j$, where $I_j = [t_j, t_{j+1})$, $1 \le j \le K$, $I_{K+1} = [t_{K+1}, t_{K+2}]$, and $t_1 = 0$, $t_j = t_0 - (\lfloor t_0 K \rfloor + 2 - j)/K$, $1 < j \le K + 1$, $t_{K+2} = 1$. Note that in this case the intervals $I_2, \dots, I_K$ have length $1/K$, but that $I_1$ and $I_{K+1}$ have a shorter length. Furthermore, just as in case (i), $t_0$ is the left boundary point of one of the intervals $I_j$.
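The two cases can be sketched in a few lines. The function name `birge_partition` is hypothetical, and the tolerance used to decide whether $K t_0$ is an integer is an assumption needed for floating point arithmetic; the returned list contains the boundary points $t_1, t_2, \dots$ of the partition.

```python
import math

def birge_partition(t0, K, tol=1e-12):
    """Boundary points of the partition of [0, 1] with t0 as a left endpoint."""
    if abs(t0 * K - round(t0 * K)) < tol:
        # case (i): K * t0 is an integer, K intervals of equal length 1/K
        return [j / K for j in range(K + 1)]
    # case (ii): shift the grid so that t0 becomes a grid point
    m = math.floor(t0 * K)
    first = t0 - m / K                         # first interior point, in (0, 1/K)
    return [0.0] + [first + j / K for j in range(K)] + [1.0]

points_i = birge_partition(0.5, 10)            # case (i): 10 intervals
points_ii = birge_partition(0.37, 10)          # case (ii): 11 intervals
```

In case (ii) the first and last intervals are shorter than $1/K$, and their lengths add up to $1/K$.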

In fact we slightly modified the definition of Birgé, who always partitions the interval into $K$ subintervals of equal length. The reason for our modification is that we want to assign a fixed position to $t_0$ with respect to the boundary points of the interval $I_j$ to which it belongs, since the bias of the estimator heavily depends on this position. Letting $t_0$ be a left boundary point enables us to compare the results for different sample sizes "on equal footing", so to speak.

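Let $\Delta_{i1}$, $\Delta_{i2}$ and $\Delta_{i3}$ be defined by (1.1). We define, following [1],

\[ N_j = \#\{T_i : T_i \in I_j\}, \quad M_j = \#\{U_i : U_i \in I_j\}, \quad Q_{j,k} = \#\{(T_i, U_i) : T_i \in I_j, U_i \in I_k\}, \]

and

\[ N'_j = \#\{T_i : T_i \in I_j, \Delta_{i1} = 1\}, \quad M'_j = \#\{U_i : U_i \in I_j, \Delta_{i3} = 1\}, \quad Q'_{j,k} = \#\{(T_i, U_i) : T_i \in I_j, U_i \in I_k, \Delta_{i2} = 1\}. \]

In addition to these (integer-valued) random variables, [1] defines the random variables:

\[ F^{(j,k)} = \begin{cases} \dfrac{N'_k}{N_k} - \dfrac{Q'_{j,k}}{Q_{j,k}}, & j \le k, \\[2ex] 1 - \dfrac{M'_k}{M_k} + \dfrac{Q'_{k,j}}{Q_{k,j}}, & j > k, \end{cases} \tag{3.1} \]

and weights $w_{j,k}$, defined by

\[ w_{j,k} = \begin{cases} \dfrac{\sqrt{N_k \wedge (K Q_{j,k})}}{(k - j + 1) W_j}, & j \le k, \\[2ex] \dfrac{\sqrt{M_k \wedge (K Q_{k,j})}}{(j - k + 1) W_j}, & j > k, \end{cases} \tag{3.2} \]

where

\[ W_j = \sum_{k < j} \frac{\sqrt{M_k \wedge (K Q_{k,j})}}{j - k + 1} + \sum_{k \ge j} \frac{\sqrt{N_k \wedge (K Q_{j,k})}}{k - j + 1}. \tag{3.3} \]

We are now ready to define Birgé's estimator $\tilde F_n$.

Definition 3.1. (Birgé's estimator) Let the intervals $I_j$ be defined as in (i) or (ii) above (depending on the value of $t_0$), and let $F^{(j,k)}$ and the weights $w_{j,k}$ be defined by (3.1) and (3.2), respectively. Then, for $t$ belonging to the interval $I_j$, Birgé's estimator $\tilde F_n(t)$ of $F_0(t)$ is defined by

\[ \tilde F_n(t) = \sum_{k} w_{j,k} F^{(j,k)}. \tag{3.4} \]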

In determining the asymptotic distribution of Birgé's estimator, we are faced with the following difficulties.

(1) The weights $w_{j,k}$ are ratios of random variables, which interact with the random variables $M'_k/M_k$, $N'_k/N_k$ and $Q'_{j,k}/Q_{j,k}$, for which they are multipliers.

(2) The ratios $M'_k/M_k$, $N'_k/N_k$ and $Q'_{j,k}/Q_{j,k}$ are themselves ratios of random variables.

(3) The weighted sum, defining Birgé's estimator, consists of dependent summands. The dependence is caused by the dependence of the weights, the dependence between the $M'_k/M_k$, $N'_k/N_k$ and $Q'_{j,k}/Q_{j,k}$ and the dependence between the weights and these terms. This prevents a straightforward use of the Lindeberg-Feller central limit theorem.

These difficulties have to be dealt with in turn. The following crucial lemma bears on difficulty (1), by showing that the random weights $w_{j,k}$ are close to deterministic weights $\bar w_{j,k}$.

Lemma 3.1. Consider a partition of $[0, 1]$ into $K$ or $K + 1$ subintervals, according to the construction of Birgé's estimator, using the scheme of (i) and (ii) at the beginning of this section. Assume that, for a fixed constant $c > 0$,

\[ K = K_n \sim \frac{(n \log n)^{1/3}}{c}, \quad n \to \infty, \tag{3.5} \]

that is: the asymptotic binwidth is given by $c (n \log n)^{-1/3}$. Moreover, assume that the observation density $h$ is continuous on the upper triangle of the unit square, staying away from zero on its support. Let $g_1$ and $g_2$ be the first and second marginal density of $h$, respectively. Finally, let $t_0$ be the left boundary point of $I_j$, let $a(t)$ and $b(t)$ be defined by

\[ a(t) = \sqrt{h(t_0, t) \wedge g_1(t)}, \quad b(t) = \sqrt{h(t, t_0) \wedge g_2(t)}, \tag{3.6} \]

and let the deterministic weights $\bar w_{j,k}$ be defined by:

\[ \bar w_{j,k} = \begin{cases} \dfrac{3 a(t_k)}{\{a(t_0) + b(t_0)\} (k - j + 1) \log n}, & k \ge j, \\[2ex] \dfrac{3 b(t_k)}{\{a(t_0) + b(t_0)\} (j - k + 1) \log n}, & k < j. \end{cases} \tag{3.7} \]

Then:

(i)
\[ \sup_k (1 + |j - k|) E |w_{j,k} - \bar w_{j,k}| = o(1/\log n), \quad n \to \infty. \tag{3.8} \]

(ii) $W_j$, defined by (3.3), satisfies

\[ W_j = \tfrac13 (\log n) \sqrt{n/K} \, \{a(t_0) + b(t_0)\} \{1 + o_p(1)\}, \quad n \to \infty, \tag{3.9} \]

and, for $m = 1, 2, \dots$

\[ E\bigl\{W_j^{-m} 1_{\{W_j > 0\}}\bigr\} \sim (9K/n)^{m/2} \{(a(t_0) + b(t_0)) \log n\}^{-m}, \quad n \to \infty. \tag{3.10} \]



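It may be helpful to give some motivation for the construction of Birgé's statistic. If we replace $N_k$, $N'_k$, etc. by their expected values, we obtain:

\[ \sum_{k \ge j} w_{j,k} \Bigl\{ \frac{\int_{I_k} F_0(u) \, dG_1(u)}{G_1(t_{k+1}) - G_1(t_k)} - \frac{\int_{I_j \times I_k} \{F_0(u) - F_0(t)\} \, dH(t,u)}{\int_{I_j \times I_k} dH(t,u)} \Bigr\} + \sum_{k < j} w_{j,k} \Bigl\{ 1 - \frac{\int_{I_k} \{1 - F_0(u)\} \, dG_2(u)}{G_2(t_{k+1}) - G_2(t_k)} + \frac{\int_{I_k \times I_j} \{F_0(u) - F_0(t)\} \, dH(t,u)}{\int_{I_k \times I_j} dH(t,u)} \Bigr\}, \]

where $G_1$ and $G_2$ are the first and second marginal distribution functions of $H$, respectively. By expanding at the left endpoints $t_k$ of the intervals, we get:

\begin{align*}
& \sum_{k \ge j} w_{j,k} \{F_0(t_k) - \{F_0(t_k) - F_0(t_j)\}\} + \sum_{k < j} w_{j,k} \{1 - \{1 - F_0(t_k)\} + \{F_0(t_j) - F_0(t_k)\}\} \\
& \quad + \frac{1}{2K} \sum_{k \ge j} w_{j,k} \Bigl\{ \frac{f_0(t_k) g_1(t_k)}{g_1(t_k)} - \frac{\{f_0(t_k) - f_0(t_j)\} h(t_j, t_k)}{h(t_j, t_k)} \Bigr\} + \frac{1}{2K} \sum_{k < j} w_{j,k} \Bigl\{ \frac{f_0(t_k) g_2(t_k)}{g_2(t_k)} + \frac{\{f_0(t_j) - f_0(t_k)\} h(t_k, t_j)}{h(t_k, t_j)} \Bigr\} + \dots \\
& = F_0(t_j) + \frac{1}{2K} \sum_{k \ge j} w_{j,k} \{f_0(t_k) - \{f_0(t_k) - f_0(t_j)\}\} + \frac{1}{2K} \sum_{k < j} w_{j,k} \{f_0(t_k) + \{f_0(t_j) - f_0(t_k)\}\} + \dots \\
& = F_0(t_j) + \frac{1}{2K} f_0(t_j) + \dots \tag{3.11}
\end{align*}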

One of the difficulties in this expansion that we have glossed over for the moment is that $g_1(t_k)$ tends to zero, if $t_k \to 1$, and that similarly $g_2(t_k)$ tends to zero, if $t_k \to 0$. This difficulty has to be dealt with separately. We do not have that difficulty for $h$, since we assume that $h$ stays away from zero on its support.

The expansion suggests that the asymptotic bias at $t_j$ will be $f_0(t_j)/(2K)$, which is indeed the case. However, the expansion does not explain the particular choice of the weights. Considering the deterministic counterparts $\bar w_{j,k}$ of $w_{j,k}$, given by (3.7) in Lemma 3.1, we see that the weights are proportional to $1/(1 + |j - k|)$, which has the effect that the smaller observation intervals give the biggest contribution to the estimator, taking advantage of the fact that the smaller observation intervals do indeed give more precise information on the "unobservable" $X_i$, if we know that $X_i$ is contained in the interval (see the discussion on this point in section 1). The choice of these weights reduces the variance of the estimator. Only this fact is responsible for the fact that the rate of convergence is slightly faster than $n^{-1/3}$.

It seems that the MLE is doing something similar automatically, but in a more efficient way, if we believe the "working hypothesis", discussed in section 1. Assuming the truth of this "working hypothesis", the asymptotic variance of the MLE only involves the local joint density $h$ of $(T_i, U_i)$ at $(t_0, t_0)$ and the density $f_0(t_0)$ of $X_i$ at $t_0$, whereas the variance of Birgé's estimator also involves the marginal densities of $(T_i, U_i)$, which do not appear in the local minimax lower bound, derived in section 2.

Also note that the partition, needed in the construction of Birgé's estimator, is dependent on an a priori knowledge of whether we are in the separated or non-separated case; in the non-separated case binwidths of order $(n \log n)^{-1/3}$ are taken (otherwise the higher rate $(n \log n)^{-1/3}$ would not be attained), and in the separated case binwidths of order $n^{-1/3}$ (taking $(n \log n)^{-1/3}$ would let the variance dominate the bias, as the sample size tends to infinity). For the computation of the maximum likelihood estimator (MLE), discussed in section 5, it is not necessary to use a priori knowledge on the observation distribution; the MLE, considered as a histogram, adapts automatically to the separated or non-separated case and will generally choose a smaller binwidth for the non-separated case. This is one of the major advantages of the MLE over Birgé's estimator, apart from being monotone with values restricted to $[0, 1]$.



Using the notation of Lemma 3.1 we can now formulate the main result for 
Birge's estimator. 

Theorem 3.1. Let the observation density $h$ satisfy the same condition as in Lemma 3.1, and let $F_0$ have a continuous derivative $f_0$ on $(0,1)$, satisfying $f_0(t_0) > 0$. Furthermore, let $I_j^{(n)}$ be a subinterval, belonging to the partition of $[0,1]$ into $K$ intervals, corresponding to the construction of Birgé's estimator for a sample of size $n$, where $K$ is defined by (3.5) in Lemma 3.1. Finally, let $a_n$ be defined by

\[ a_n = (n \log n)^{-1/3}, \tag{3.12} \]

and let $t_j^{(n)}$ be the left boundary point of $I_j^{(n)}$, for which we assume that it converges to an interior point $t_0 \in (0, 1)$, as $n \to \infty$. Then:

(i)
\[ a_n^{-1} \bigl\{\tilde F_n(t_j^{(n)}) - F_0(t_j^{(n)})\bigr\} \xrightarrow{D} N\bigl(\tfrac12 c f_0(t_0), \sigma_0^2\bigr), \quad n \to \infty, \tag{3.13} \]

where the right-hand side of (3.13) denotes a normal random variable, with expectation $\tfrac12 c f_0(t_0)$ and variance

\[ \sigma_0^2 = \frac{3 f_0(t_0) \{a(t_0)^2 + b(t_0)^2\}}{c \, h(t_0, t_0) \{a(t_0) + b(t_0)\}^2}, \tag{3.14} \]

and where $c$, $a(t_0)$ and $b(t_0)$ are defined by (3.5) and (3.6).

(ii)
\[ \lim_{n \to \infty} a_n^{-1} E\bigl\{\tilde F_n(t_j^{(n)}) - F_0(t_j^{(n)})\bigr\} = \tfrac12 c f_0(t_0), \tag{3.15} \]

and

\[ \lim_{n \to \infty} a_n^{-2} \, \mathrm{var}\bigl\{\tilde F_n(t_j^{(n)})\bigr\} = \sigma_0^2. \tag{3.16} \]
Note that Theorem 3.1 implies an optimal value of $c$, which balances the squared asymptotic bias $\tfrac14 c^2 f_0(t_0)^2$ against the asymptotic variance $\sigma_0^2$. This value of the constant was used in the simulations, reported below.
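The balance between bias and variance can be made explicit under one assumption, namely that "optimal" refers to minimizing the asymptotic mean squared error. By (3.14), the product $c \sigma_0^2$ does not depend on $c$; writing $A = c \sigma_0^2$ (our notation), the computation runs as follows.

```latex
\[
  \mathrm{AMSE}(c) = \tfrac14 c^2 f_0(t_0)^2 + \frac{A}{c}, \qquad
  \frac{d}{dc}\,\mathrm{AMSE}(c) = \tfrac12 c f_0(t_0)^2 - \frac{A}{c^2} = 0
  \iff c^3 = \frac{2A}{f_0(t_0)^2},
\]
\[
  c_{\mathrm{opt}} = \Bigl\{\frac{2 A}{f_0(t_0)^2}\Bigr\}^{1/3},
  \qquad A = c\,\sigma_0^2 .
\]
```

The second derivative of the asymptotic mean squared error is positive for $c > 0$, so the stationary point is a minimum.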



4. Birgé's estimator in the separated case



We consider the asymptotic behavior of Birgé's estimator in the separated case. This is mainly meant for illustrative purposes and to give background to the simulation study. We therefore do not aim to prove results in the widest generality and confine our discussion to the case where the density $h$ of the observed pairs $(T_i, U_i)$ has as support the triangle with vertices $(0, \epsilon)$, $(0, 1)$ and $(1 - \epsilon, 1)$ and stays away from zero on its support, which is the situation we consider in the simulation study. In this case the faster rate $(n \log n)^{-1/3}$ is unattainable, and we know that Birgé's estimator (and also the MLE) can only achieve the rate $n^{-1/3}$. We therefore assume $K$ to be of order $n^{1/3}$ and set $K = \lfloor c^{-1} n^{1/3} \rfloor$.

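As in section 3 we introduce deterministic weights $\bar w_{j,k}$ to replace the random weights $w_{j,k}$. Recall that, by definition,

\[ w_{j,k} = \begin{cases} \dfrac{\sqrt{N_k \wedge (K Q_{j,k})}}{(k - j + 1) W_j}, & j \le k, \\[2ex] \dfrac{\sqrt{M_k \wedge (K Q_{k,j})}}{(j - k + 1) W_j}, & j > k, \end{cases} \]

and

\[ W_j = \sum_{1 \le k < j} \frac{\sqrt{M_k \wedge (K Q_{k,j})}}{j - k + 1} + \sum_{j \le k \le K} \frac{\sqrt{N_k \wedge (K Q_{j,k})}}{k - j + 1}. \tag{4.1} \]

Let $g_1$ and $g_2$ be the first and second marginal density of $h$, respectively, that is:

\[ g_1(t) = \int h(t, u) \, du, \quad g_2(t) = \int h(t', t) \, dt', \quad t \in [0, 1]. \tag{4.2} \]

Then, if $2\epsilon < t_0 < 1 - 2\epsilon$,

\begin{align*}
W_j &\sim \sum_{k : t_j - t_k > \epsilon} \frac{\sqrt{c n^{2/3} \{h(t_k, t_j) \wedge g_2(t_k)\}}}{j - k + 1} + \sum_{k : t_k - t_j > \epsilon} \frac{\sqrt{c n^{2/3} \{h(t_j, t_k) \wedge g_1(t_k)\}}}{k - j + 1} \\
&\sim n^{1/3} \int_{\epsilon}^{t_0 - \epsilon} \frac{\sqrt{c \{h(t, t_0) \wedge g_2(t)\}}}{t_0 - t} \, dt + n^{1/3} \int_{t_0 + \epsilon}^{1 - \epsilon} \frac{\sqrt{c \{h(t_0, t) \wedge g_1(t)\}}}{t - t_0} \, dt,
\end{align*}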
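showing $W_j \asymp n^{1/3}$. The deterministic weights $\bar w_{j,k}$ are now defined by:

\[ \bar w_{j,k} = \begin{cases} \dfrac{\sqrt{h(t_k, t_j) \wedge g_2(t_k)}}{K W(t_0) (t_0 - t_k)}, & k < j, \\[2ex] \dfrac{\sqrt{h(t_j, t_k) \wedge g_1(t_k)}}{K W(t_0) (t_k - t_0)}, & k > j, \end{cases} \tag{4.3} \]

where

\[ W(t_0) = \int_{\epsilon}^{t_0 - \epsilon} \frac{\sqrt{h(t, t_0) \wedge g_2(t)}}{t_0 - t} \, dt + \int_{t_0 + \epsilon}^{1 - \epsilon} \frac{\sqrt{h(t_0, u) \wedge g_1(u)}}{u - t_0} \, du. \tag{4.4} \]

We assume that the integrals on the right-hand side of (4.4) are finite, and hence that $W(t_0) < \infty$.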

We now have the following lemma, which plays a similar role as Lemma 3.1 
in section 3. 

Lemma 4.1. Consider a partition of $[0, 1]$ into $K$ or $K + 1$ subintervals, according to the construction of Birgé's estimator, using the scheme of (i) and (ii) at the beginning of section 3. Assume that

\[ K = K_n \sim \frac{n^{1/3}}{c}, \]

for a fixed constant $c > 0$, that is: the asymptotic binwidth is given by $c n^{-1/3}$. Let the weights $w_{j,k}$ and $\bar w_{j,k}$ be defined by (4.1) and (4.3), respectively, where we assume $W(t_0) < \infty$. Then:

sup (1 + \j - k\) \wj^k - Wj,k\ = Op (n'^^^) , (4.5) 

Using this lemma, we get the following limit result (compare with Theorem 
3.1). 

Theorem 4.1. Suppose that the observation density $h$ has as support the triangle with vertices $(0, \epsilon)$, $(0, 1)$ and $(1 - \epsilon, 1)$ and stays away from zero on its support. Let $F_0$ have a continuous derivative $f_0$ on $(0, 1)$, satisfying $f_0(t_0) > 0$. Moreover, let $I_j^{(n)}$ be a subinterval, belonging to the partition of $[0, 1]$ into $K$ intervals, corresponding to the construction of Birgé's estimator for a sample of size $n$. Finally, let $W(t_0)$ be defined by (4.4), where we assume $W(t_0) < \infty$.

Assume that, for a fixed constant $c > 0$, $K = K_n \sim n^{1/3}/c$, and let $t_j^{(n)}$ be the left boundary point of $I_j^{(n)}$, for which we assume that it converges to an interior point $t_0 \in (0, 1)$, as $n \to \infty$. Then we have, as $n \to \infty$,

\[ n^{1/3} \bigl\{\tilde F_n(t_j^{(n)}) - F_0(t_j^{(n)})\bigr\} \xrightarrow{D} N\bigl(\tfrac12 c f_0(t_0), \sigma^2\bigr), \tag{4.6} \]

where the right-hand side of (4.6) denotes a normal random variable, with expectation $\tfrac12 c f_0(t_0)$ and variance

\begin{align*}
\sigma^2 &= \frac{1}{c W(t_0)^2} \int_{t_0 + \epsilon}^{1 - \epsilon} \frac{g_1(u) \wedge h(t_0, u)}{h(t_0, u) (u - t_0)^2} \{F_0(u) - F_0(t_0)\} \{1 - (F_0(u) - F_0(t_0))\} \, du \\
&\quad + \frac{1}{c W(t_0)^2} \int_{\epsilon}^{t_0 - \epsilon} \frac{g_2(t) \wedge h(t, t_0)}{h(t, t_0) (t_0 - t)^2} \{F_0(t_0) - F_0(t)\} \{1 - (F_0(t_0) - F_0(t))\} \, dt. \tag{4.7}
\end{align*}



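In the simulation study we take the observation density $h$ uniform on the triangle of its support. For ease of reference, we here determine the value of the variance $\sigma^2$ of the asymptotic distribution for this case. If $h$ is uniform, its density is given by

\[ h(t, u) = \begin{cases} 2/(1 - \epsilon)^2, & 0 \le t, \ t + \epsilon \le u \le 1, \\ 0, & \text{elsewhere}. \end{cases} \tag{4.8} \]

Hence the marginal densities $g_1$ and $g_2$ are given by:

\[ g_1(t) = \frac{2}{(1 - \epsilon)^2} \int_{t + \epsilon}^{1} du = \frac{2 \{1 - t - \epsilon\}}{(1 - \epsilon)^2}, \quad t \in [0, 1 - \epsilon], \]

and

\[ g_2(u) = \frac{2}{(1 - \epsilon)^2} \int_{0}^{u - \epsilon} dt = \frac{2 \{u - \epsilon\}}{(1 - \epsilon)^2}, \quad u \in [\epsilon, 1]. \]

For $W(t_0)$ we get:

\[ W(t_0) = \int_{\epsilon}^{t_0 - \epsilon} \frac{\sqrt{2 (t - \epsilon)}}{(1 - \epsilon)(t_0 - t)} \, dt + \int_{t_0 + \epsilon}^{1 - \epsilon} \frac{\sqrt{2 (1 - u - \epsilon)}}{(1 - \epsilon)(u - t_0)} \, du. \tag{4.9} \]

Hence, using (4.7), we obtain:

\begin{align*}
\sigma^2 &= \frac{1}{c W(t_0)^2} \int_{t_0 + \epsilon}^{1 - \epsilon} \frac{1 - u - \epsilon}{(u - t_0)^2} \{F_0(u) - F_0(t_0)\} \{1 - (F_0(u) - F_0(t_0))\} \, du \\
&\quad + \frac{1}{c W(t_0)^2} \int_{\epsilon}^{t_0 - \epsilon} \frac{t - \epsilon}{(t_0 - t)^2} \{F_0(t_0) - F_0(t)\} \{1 - (F_0(t_0) - F_0(t))\} \, dt, \tag{4.10}
\end{align*}

where $W(t_0)$ is defined by (4.9).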

5. The maximum likelihood estimator 

As mentioned in section 1, the (nonparametric) maximum likelihood estimator (MLE or NPMLE) maximizes the (partial) log likelihood

\[ \sum_{i=1}^{n} \{\Delta_{i1} \log F(T_i) + \Delta_{i2} \log \{F(U_i) - F(T_i)\} + \Delta_{i3} \log (1 - F(U_i))\}, \]

where the maximization is over all distribution functions $F$. For the non-separated case the following conjecture was given in [5] (the lecture notes of a summer course given at Stanford University in 1990, which later appeared as part 2 of [10]):

Theorem 5.1. (Conjecture in [5]) Let $F_0$ and $H$ be continuously differentiable at $t_0$ and $(t_0, t_0)$, respectively, with strictly positive derivatives $f_0(t_0)$ and $h(t_0, t_0)$, where $H$ is the distribution function of $(T_i, U_i)$. By continuous differentiability of $H$ at $(t_0, t_0)$ is meant that the density $h(t,u)$ is continuous at $(t,u)$, if $t < u$ and $(t,u)$ is sufficiently close to $(t_0, t_0)$, and that $h(t,t)$, defined by

\[ h(t, t) = \lim_{u \downarrow t} h(t, u), \]

is continuous at $t$, for $t$ in a neighborhood of $t_0$.

Let $0 < F_0(t_0), H(t_0, t_0) < 1$, and let $\hat F_n$ be the MLE of $F_0$. Then

\[ (n \log n)^{1/3} \bigl\{\hat F_n(t_0) - F_0(t_0)\bigr\} \big/ \bigl\{\tfrac34 f_0(t_0)^2 / h(t_0, t_0)\bigr\}^{1/3} \xrightarrow{D} 2Z, \]

where $Z$ is the last time that standard two-sided Brownian motion minus the parabola $y(t) = t^2$ reaches its maximum.

It was also shown in [5] that Theorem 5.1 is true for a "toy" estimator, obtained by doing one step of the iterative convex minorant algorithm, starting the iterations at the underlying distribution function $F_0$; the "toy" aspect is that we can of course not do this in practice. In spite of the fact that now more than 20 years have passed since this conjecture was launched, it still has not been proved. In the simulation section we provide some material which seems to support the conjecture, but further research is necessary to settle this question.

For the separated case one can also introduce a toy estimator of the same type and one can again formulate the "working hypothesis" that the toy estimator and the MLE have the same pointwise limit behavior. Anticipating that this would hold, [14] derived the asymptotic distribution of the toy estimator in the separated case, under the following conditions.

(C1) The support of $F_0$ is an interval $[0, M]$, where $M < \infty$.

(C2) $F_0$ and $H$ have densities $f_0$ and $h$ w.r.t. Lebesgue measure on $\mathbb{R}$ and $\mathbb{R}^2$, respectively.

(C3) Let the functions $k_{1,\epsilon}$ and $k_{2,\epsilon}$ be defined by

and

Then, for $i = 1, 2$ and each $\epsilon > 0$,

$k_i(u, \epsilon\alpha) \, du = 0$.

(C4) $0 < F_0(t_0) < 1$ and $0 < H(t_0, t_0) < 1$.

The motivation for these conditions is given in [14] and actually becomes clear from the proof, which is not given here.

Theorem 5.2. ([14]) Suppose that assumptions (C1) to (C4) hold. Let $k_i$, $i = 1, 2$, be defined by

\[ k_1(u) = \int_u^M \frac{h(u, v)}{F_0(v) - F_0(u)} \, dv \quad \text{and} \quad k_2(v) = \int_0^v \frac{h(u, v)}{F_0(v) - F_0(u)} \, du, \]

and suppose that $f_0$, $g_1$, $g_2$, $k_1$ and $k_2$ are continuous at $t_0$, where $g_1$ and $g_2$ are the first and second marginal densities of $h$, respectively. Moreover, assume $f_0(t_0) > 0$. Then, if $F_n^{(1)}$ is the estimator of the distribution function $F_0$, obtained after one step of the iterative convex minorant algorithm, starting the iterations with $F_0$, we have

\[ n^{1/3} \{2 \xi(t_0) / f_0(t_0)\}^{1/3} \bigl\{F_n^{(1)}(t_0) - F_0(t_0)\bigr\} \xrightarrow{D} 2Z, \]

where $Z$ is the last time where standard two-sided Brownian motion minus the parabola $y(t) = t^2$ reaches its maximum, and where

\[ \xi(t_0) = \frac{g_1(t_0)}{F_0(t_0)} + k_1(t_0) + k_2(t_0) + \frac{g_2(t_0)}{1 - F_0(t_0)}. \]

It is indeed proved in [6] that, under slightly stronger conditions (the most important one being that an observation interval always has length $> \epsilon$, for some $\epsilon > 0$), which hold for the examples in the simulation below, the MLE has the same limit behavior, using the same norming constants. The expression for the asymptotic variance in the separated case is remarkably different from the conjectured variance in the non-separated case, which only depends on $F_0$ via $f_0(t_0)$, showing that only the local behavior, depending on the density at $t_0$, is important for the asymptotic variance (assuming that the working hypothesis holds).

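Note that if $(T_i, U_i)$ is uniform on the upper triangle of the unit square, with vertices $(0, \epsilon)$, $(0, 1)$ and $(1 - \epsilon, 1)$, we have:

\[ g_1(u) = \frac{2 (1 - u - \epsilon)}{(1 - \epsilon)^2}, \quad g_2(v) = \frac{2 (v - \epsilon)}{(1 - \epsilon)^2}, \]

and, if $F_0$ is the uniform distribution function on $[0, 1]$,

\[ k_1(u) = \frac{2 \log\{(1 - u)/\epsilon\}}{(1 - \epsilon)^2}, \quad k_2(v) = \frac{2 \log(v/\epsilon)}{(1 - \epsilon)^2}, \]

so

\[ \xi(t_0) = \frac{2}{(1 - \epsilon)^2} \Bigl\{ \frac{1 - t_0 - \epsilon}{t_0} + \log\Bigl(\frac{t_0 (1 - t_0)}{\epsilon^2}\Bigr) + \frac{t_0 - \epsilon}{1 - t_0} \Bigr\} \]

in this case. If $F_0$ is given by $F_0(x) = 1 - (1 - x)^4$, $x \in [0, 1]$, we get:

\begin{align*}
\xi(t_0) &= \frac{2}{(1 - \epsilon)^2} \Bigl\{ \frac{1 - t_0 - \epsilon}{F_0(t_0)} + \frac{t_0 - \epsilon}{1 - F_0(t_0)} \Bigr\} \tag{5.1} \\
&\quad + \frac{1}{2 (1 - \epsilon)^2 (1 - t_0)^3} \Bigl\{ 2 \arctan\Bigl(\frac{1 - t_0 - \epsilon}{1 - t_0}\Bigr) + \log\Bigl(\frac{2 (1 - t_0) - \epsilon}{\epsilon}\Bigr) \\
&\qquad + 2 \arctan\Bigl(\frac{1 - t_0 + \epsilon}{1 - t_0}\Bigr) - 2 \arctan\Bigl(\frac{1}{1 - t_0}\Bigr) + \log\Bigl(\frac{t_0 (2 (1 - t_0) + \epsilon)}{\epsilon (2 - t_0)}\Bigr) \Bigr\}. \tag{5.2}
\end{align*}

We give some results for the latter model in section 8.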
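The closed form for uniform $F_0$ can be checked numerically. The sketch below assumes that the constant in Theorem 5.2 has the form $g_1(t_0)/F_0(t_0) + k_1(t_0) + k_2(t_0) + g_2(t_0)/(1 - F_0(t_0))$ and that the observation density equals $2/(1 - \epsilon)^2$ on the triangle; the function names and the midpoint rule are our choices for this example.

```python
import math

def xi_uniform_closed(t0, eps):
    """Closed form of xi(t0) for uniform F_0 and uniform observation density."""
    norm = 2.0 / (1.0 - eps) ** 2
    return norm * ((1.0 - t0 - eps) / t0
                   + math.log(t0 * (1.0 - t0) / eps ** 2)
                   + (t0 - eps) / (1.0 - t0))

def midpoint(f, a, b, steps=20000):
    """Midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

def xi_uniform_numeric(t0, eps):
    """The same constant, with k_1 and k_2 computed by numerical integration."""
    norm = 2.0 / (1.0 - eps) ** 2              # value of h on its support
    g1 = norm * (1.0 - t0 - eps)               # first marginal density at t0
    g2 = norm * (t0 - eps)                     # second marginal density at t0
    k1 = midpoint(lambda v: norm / (v - t0), t0 + eps, 1.0)
    k2 = midpoint(lambda u: norm / (t0 - u), 0.0, t0 - eps)
    return g1 / t0 + k1 + k2 + g2 / (1.0 - t0)

closed = xi_uniform_closed(0.5, 0.1)
numeric = xi_uniform_numeric(0.5, 0.1)
```

The same numerical integration can be applied to other choices of $F_0$ by replacing the denominators $v - t_0$ and $t_0 - u$ by $F_0(v) - F_0(t_0)$ and $F_0(t_0) - F_0(u)$, and the divisors $t_0$ and $1 - t_0$ by $F_0(t_0)$ and $1 - F_0(t_0)$.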
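6. The smoothed maximum likelihood estimator

Let $h$ be the density of $(T_i, U_i)$, with first marginal density $h_1$ and second marginal $h_2$, and let $\phi_{t,b,F}$ be a solution of the integral equation (in $\phi$):

\[ \phi(u) = d_F(u) \Bigl\{ k_{t,b}(u) + \int_{v > u} \frac{\phi(v) - \phi(u)}{F(v) - F(u)} h(u, v) \, dv - \int_{v < u} \frac{\phi(u) - \phi(v)}{F(u) - F(v)} h(v, u) \, dv \Bigr\}, \]

where

\[ d_F(u) = \frac{F(u) \{1 - F(u)\}}{h_1(u) \{1 - F(u)\} + h_2(u) F(u)}, \]

and the function $k_{t,b}$ is defined by

\[ k_{t,b}(u) = b^{-1} K((t - u)/b). \tag{6.1} \]

Moreover, let the function $\theta_{t,b,F}$ be defined by

\[ \theta_{t,b,F}(u, v, \delta_1, \delta_2) = -\frac{\delta_1 \phi_{t,b,F}(u)}{F(u)} - \frac{\delta_2 \{\phi_{t,b,F}(v) - \phi_{t,b,F}(u)\}}{F(v) - F(u)} + \frac{\delta_3 \phi_{t,b,F}(v)}{1 - F(v)}, \tag{6.2} \]

where $u < v$ and $\delta_3 = 1 - \delta_1 - \delta_2$. Then, as in [3] (separated case) and [4] (non-separated case), we have the representation

\begin{align*}
\int \mathbb{K}((t - u)/b) \, d(\hat F_n - F_0)(u) &= \int \theta_{t,b,\hat F_n}(u, v, \delta_1, \delta_2) \, dP_0(u, v, \delta_1, \delta_2) \\
&= -\int \frac{\phi_{t,b,\hat F_n}(u)}{\hat F_n(u)} F_0(u) h_1(u) \, du - \int \frac{\phi_{t,b,\hat F_n}(v) - \phi_{t,b,\hat F_n}(u)}{\hat F_n(v) - \hat F_n(u)} \{F_0(v) - F_0(u)\} h(u, v) \, du \, dv \\
&\quad + \int \frac{\phi_{t,b,\hat F_n}(v)}{1 - \hat F_n(v)} \{1 - F_0(v)\} h_2(v) \, dv.
\end{align*}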
Table 1

Estimates of the actual variances $\mathrm{var}(\tilde F_n^{(SML)}(t))$ (times $n$) and corresponding theoretical variances $E\theta^2_{t,b_n,F_0}$, where $b_n = n^{-1/5}$, for sample size $n = 1000$. The estimates of the actual variances were based on 10,000 samples of size 1000 from a Uniform(0, 1) distribution $F_0$ and a uniform observation distribution $H$ on the upper triangle of the unit square.

t      n var(F_n(t))   E theta^2    ratio
0.1    0.146489        0.142235     1.029910
0.2    0.262056        0.255404     1.026044
0.3    0.334990        0.332985     1.006019
0.4    0.380357        0.376413     1.010479
0.5    0.399258        0.390382     1.022736
0.6    0.386292        0.376340     1.026444
0.7    0.342651        0.332856     1.029428
0.8    0.261457        0.255255     1.024296
0.9    0.145304        0.142129     1.022338


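For $F = F_0$ we get the integral equation:

$$\phi(u) = d_{F_0}(u)\left\{k_{t,b}(u) + \int_{v>u} \frac{\phi(v)-\phi(u)}{F_0(v)-F_0(u)}\,h(u,v)\,dv - \int_{v<u} \frac{\phi(u)-\phi(v)}{F_0(u)-F_0(v)}\,h(v,u)\,dv\right\}.$$

Using the theory in [3] and [4] again, we get that the solution $\phi_{t,b,F_0}$ gives as an
approximation for $n\,\mathrm{var}(\tilde F_n(t))$:

$$\int \frac{\phi_{t,b,F_0}(u)^2}{F_0(u)}\,h_1(u)\,du + \iint_{u<v} \frac{\{\phi_{t,b,F_0}(v)-\phi_{t,b,F_0}(u)\}^2}{F_0(v)-F_0(u)}\,h(u,v)\,du\,dv + \int \frac{\phi_{t,b,F_0}(v)^2}{1-F_0(v)}\,h_2(v)\,dv.$$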

The approximation seems to work pretty well, as can be seen in table 1, where
we estimated the actual variance for samples of size $n = 1000$ by generating
10,000 samples of size 1000 from a Uniform(0, 1) distribution $F_0$ and a uniform
observation distribution $H$ on the upper triangle of the unit square.

As in the papers cited above, we do not have an explicit expression for $\phi_{t,b_n,F_0}$;
a picture of $\phi_{t,b_n,F_0}$ for $F_0$ the Uniform(0, 1) distribution and $b_n = n^{-1/5}$
is shown in Figure 3; the function was computed by solving the corresponding
matrix equation on a 1000 x 1000 grid. Note that we apply the smooth functional
theory of the above mentioned papers (which is also discussed in [6]) not for
a fixed functional, but for changing functionals on shrinking intervals (in the
hidden space). The reason we can do this is that the bandwidth $b$ is chosen to be
of a larger order than the critical rate $n^{-1/3}$, and that then a different type of
asymptotics sets in, with asymptotic normality, etc., instead of the non-standard
asymptotics of the MLE itself. This method is also used in [8], for the current
status model.
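The matrix equation mentioned above can be sketched numerically. The following is only an illustration under stated assumptions, not necessarily the discretization used for Figure 3: $F_0$ is Uniform(0, 1), $h \equiv 2$ on the upper triangle, the grid consists of midpoints, $K$ is taken to be the triweight kernel (the kernel is not specified in this section), the singular kernel $1/|v-u|$ is simply dropped on the diagonal, and the integral equation is used in the form $\phi/d_F = k_{t,b} + \int \{\phi(v)-\phi(u)\}/|F(v)-F(u)|\,h\,dv$. The names `solve_phi` and `theoretical_variance` are hypothetical.

```python
import numpy as np

def triweight(x):
    """Triweight kernel K(x) = (35/32)(1 - x^2)^3 on [-1, 1] (an assumed choice of K)."""
    return np.where(np.abs(x) <= 1.0, 35.0 / 32.0 * (1.0 - x**2) ** 3, 0.0)

def solve_phi(t, b, m=400):
    """Solve the discretized integral equation for phi on a midpoint grid of size m."""
    u = (np.arange(m) + 0.5) / m          # grid midpoints in (0, 1)
    du = 1.0 / m
    F = u                                  # F_0 uniform (assumption)
    h1, h2 = 2.0 * (1.0 - u), 2.0 * u      # marginals of h = 2 on {u < v}
    d = F * (1.0 - F) / (h1 * (1.0 - F) + h2 * F)
    k = triweight((t - u) / b) / b         # k_{t,b}(u) = K((t - u)/b) / b
    dist = np.abs(F[:, None] - F[None, :])
    np.fill_diagonal(dist, np.inf)         # drop the singular diagonal
    W = 2.0 * du / dist                    # h dv / |F(v) - F(u)|
    A = np.diag(1.0 / d + W.sum(axis=1)) - W
    phi = np.linalg.solve(A, k)
    return u, phi, A, k

def theoretical_variance(u, phi):
    """Discretized sum of the three integrals approximating n var of the smoothed MLE."""
    du = u[1] - u[0]
    F = u
    term1 = np.sum(phi**2 / F * 2.0 * (1.0 - u)) * du
    term3 = np.sum(phi**2 / (1.0 - F) * 2.0 * u) * du
    dist = np.abs(F[:, None] - F[None, :])
    np.fill_diagonal(dist, np.inf)
    # sum over ordered pairs i != j equals 2 * sum over i < j, matching h = 2
    term2 = np.sum((phi[None, :] - phi[:, None]) ** 2 / dist) * du * du
    return term1 + term2 + term3

u, phi, A, k = solve_phi(t=0.5, b=1000 ** (-0.2))
print(theoretical_variance(u, phi))
```

In this discretization the matrix is symmetric and diagonally dominant, and the variance expression equals the grid approximation of $\int \phi\,k_{t,b}\,du$, which gives a cheap consistency check on the solver.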



Fig 3. The function $u \mapsto \phi_{t,b_n,F_0}(u)$, $u \in [0,1]$, for $t = 0.7$, $b_n = n^{-1/5}$, $n = 1000$, the
Uniform(0, 1) distribution $F_0$ and a uniform observation distribution $H$ on the upper triangle
of the unit square.

In analogy with Theorem 4.2 in [8] we expect the following result to hold,
using the conditions on the underlying distributions, discussed in [3] and [4]. To
avoid messy notation, we will denote the smoothed MLE by $\tilde F_n$, instead of the
notation used in (1.3), in the remainder of this section.

Theorem 6.1. [Conjectured] Let the conditions of Theorem 1, p. 212, in [3]
(separated case) or Theorem 3.2, p. 647, in [4] (non-separated case) be satisfied.
Moreover, let the joint density $h$ of $(T_i, U_i)$ have a continuous
bounded second total derivative in the interior of its domain and let $f_0$ have a
continuous derivative at the interior point $t$ of the support of $f_0$, and let $\tilde F_n$ be
the smoothed MLE, defined by (1.3). Then, if $b_n \asymp n^{-1/5}$, we have

$$\sqrt{n}\left\{\tilde F_n(t) - F_0(t) - \tfrac12 b_n^2 f_0'(t)\int u^2K(u)\,du\right\}\Big/\sigma_n \stackrel{\mathcal D}{\longrightarrow} N(0,1), \qquad n \to \infty,$$

where $N(0, 1)$ is the standard normal distribution and $\sigma_n^2$ is defined by

$$\sigma_n^2 = E\,\theta_{t,b_n,F_0}(T_1,U_1,\Delta_{11},\Delta_{12})^2, \tag{6.3}$$

with $\theta_{t,b_n,F_0}$ given by (6.2).
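For orientation, evaluating a smoothed MLE from a discrete MLE can be sketched as follows. The sketch assumes the usual form $\tilde F_n(t) = \int I\!K((t-x)/b_n)\,d\hat F_n(x)$, with $I\!K$ the integrated kernel, and assumes the triweight kernel for $K$; definition (1.3) itself lies outside this section, so both are assumptions, and the function names are hypothetical.

```python
import numpy as np

def integrated_triweight(x):
    """IK(x) = integral of K from -1 to x, for K(u) = (35/32)(1 - u^2)^3 on [-1, 1]."""
    x = np.clip(np.asarray(x, dtype=float), -1.0, 1.0)
    return 0.5 + (35.0 / 32.0) * (x - x**3 + 0.6 * x**5 - x**7 / 7.0)

def smoothed_mle(t, jump_points, jump_masses, b):
    """Evaluate sum_i p_i * IK((t - x_i)/b) for an MLE with masses p_i at points x_i."""
    z = (t - np.asarray(jump_points, dtype=float)) / b
    return float(np.sum(np.asarray(jump_masses, dtype=float) * integrated_triweight(z)))

# Hypothetical MLE with three jumps; bandwidth of the order n^{-1/5} for n = 1000.
print(smoothed_mle(0.5, [0.2, 0.5, 0.8], [0.3, 0.4, 0.3], b=1000 ** (-0.2)))
```

Because the integrated kernel is a smooth distribution function, the result is a continuously differentiable distribution function, in contrast with the piecewise constant MLE.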

Note that (the conjectured) Theorem 6.1 covers both the separated and the
non-separated case. Unfortunately, we do not have an explicit expression for
(6.3) in Theorem 6.1 at present. The functions $\phi_{t,b_n,F_0}$, defining the function $\theta_{t,b_n,F_0}$
and hence also the variance $\sigma_n^2$, are of a rather different nature for the separated
case and the non-separated case. For an example of this, see Figure 4.
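The variance $\sigma_n^2$ can be estimated by

$$\int \theta_{t,b_n,\tilde F_n}(u,v,\delta_1,\delta_2)^2\,d\mathbb P_n(u,v,\delta_1,\delta_2),$$

where

$$\theta_{t,b_n,\tilde F_n}(u,v,\delta_1,\delta_2) = -\frac{\delta_1\,\phi(u)}{\tilde F_n(u)} - \frac{\delta_2\{\phi(v)-\phi(u)\}}{\tilde F_n(v)-\tilde F_n(u)} + \frac{(1-\delta_1-\delta_2)\,\phi(v)}{1-\tilde F_n(v)},$$

and $\phi = \phi_{t,b_n,\tilde F_n}$ solves the integral equation

$$\phi(u) = d_{\tilde F_n}(u)\left\{k_{t,b_n}(u) + \int_{v>u} \frac{\phi(v)-\phi(u)}{\tilde F_n(v)-\tilde F_n(u)}\,h_n(u,v)\,dv - \int_{v<u} \frac{\phi(u)-\phi(v)}{\tilde F_n(u)-\tilde F_n(v)}\,h_n(v,u)\,dv\right\}, \tag{6.4}$$

and where $h_n$ is a kernel estimate of the density $h$ and where

$$d_{\tilde F_n}(u) = \frac{\tilde F_n(u)\{1-\tilde F_n(u)\}}{h_{n1}(u)\{1-\tilde F_n(u)\} + h_{n2}(u)\tilde F_n(u)}, \qquad h_{n1}(u) = \int h_n(u,v)\,dv, \quad h_{n2}(u) = \int h_n(v,u)\,dv.$$

For $b_n$ chosen as in the theorem, the distribution function $\tilde F_n$ will be strictly
increasing with probability tending to one. Since $\tilde F_n$ is also continuously
differentiable, the equation (6.4) will have an absolutely continuous solution $\phi_{t,b_n,\tilde F_n}$,
and we do not have to take recourse to a solution pair, as in [4], which deals
separately with a discrete and absolutely continuous part.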

In the corresponding result for the current status model we have explicit
expressions, and we briefly discuss the analogy here, using a notation of the
same type. Let $\tilde F_n^{(cs)}$ be the smoothed MLE for the current status model,
defined by (1.3), but now using the MLE $\hat F_n$ in the current status model. In this
case the function $\theta^{(cs)}_{t,b,F}$, representing the functional in the observation space, is
given by

$$\theta^{(cs)}_{t,b,F}(u,\delta) = -\frac{\delta\,\phi(u)}{F(u)} + \frac{(1-\delta)\,\phi(u)}{1-F(u)}, \qquad u \in (0,1), \tag{6.5}$$

where $\phi$ is given by:

$$\phi^{(cs)}_{t,b,F}(u) = \frac{F(u)\{1-F(u)\}}{g(u)}\,k_{t,b}(u),$$

and $k_{t,b}$ is defined by (6.1). Moreover, $g$ is the density of the (one-dimensional)
observation distribution.

Fig 4. The function $u \mapsto \phi_{t,b,F_0}(t - bu)$, $u \in [-5,5]$, for $t = 0.5$, $b = 0.1$, the Uniform(0, 1)
distribution $F_0$ and (non-separated case:) a uniform observation distribution $H$ on the upper
triangle of the unit square (solid curve) and the function $u \mapsto \phi_{t,b,F_0}(t - bu)$ for the (separated)
case where the observation distribution $H$ is uniform on the triangle with vertices $(0, \epsilon)$, $(0, 1)$
and $(1 - \epsilon, 1)$, where $\epsilon = 0.2$ (dashed).

The solution $\phi^{(cs)}_{t,b_n,F_0}$ gives as an approximation for $n\,\mathrm{var}(\tilde F_n(t))$:

$$\int \frac{\phi^{(cs)}_{t,b_n,F_0}(u)^2}{F_0(u)}\,g(u)\,du + \int \frac{\phi^{(cs)}_{t,b_n,F_0}(u)^2}{1-F_0(u)}\,g(u)\,du = \int \frac{F_0(u)\{1-F_0(u)\}\,k_{t,b_n}(u)^2}{g(u)}\,du \sim \frac{F_0(t)\{1-F_0(t)\}}{b_n\,g(t)}\int K(u)^2\,du, \qquad b_n \to 0.$$

Moreover,

$$\lim_{b\downarrow 0} b\,E\,\theta^{(cs)}_{t,b,F_0}(T_1,\Delta_1)^2 = \frac{F_0(t)\{1-F_0(t)\}}{g(t)}\int K(u)^2\,du,$$

so in this case we obtain the central limit theorem 



F„{t) - Fo{t) - \blf'^(t) I u^K{u) du 



V 



N{0,l),n-> cx). 
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where 

see Theorem 4.2, p. 365, [8]. 
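The explicit current status expressions can be checked numerically. The sketch below assumes $F_0$ uniform on [0, 1], a uniform observation density $g \equiv 1$, and the triweight kernel for $K$ (all assumptions chosen for illustration); it computes $\int F_0(1-F_0)k_{t,b}^2/g\,du$ on a fine midpoint grid and compares $b$ times this value with $F_0(t)\{1-F_0(t)\}\int K^2/g(t)$. The function name is hypothetical.

```python
import numpy as np

def triweight(x):
    """Triweight kernel K(x) = (35/32)(1 - x^2)^3 on [-1, 1] (an assumed choice of K)."""
    return np.where(np.abs(x) <= 1.0, 35.0 / 32.0 * (1.0 - x**2) ** 3, 0.0)

def cs_variance_approx(t, b, F0, g, m=200_000):
    """Midpoint-rule value of int F0 (1 - F0) k_{t,b}^2 / g du over [0, 1]."""
    u = (np.arange(m) + 0.5) / m
    k = triweight((t - u) / b) / b
    return float(np.sum(F0(u) * (1.0 - F0(u)) * k**2 / g(u)) / m)

t, b = 0.5, 0.01
value = cs_variance_approx(t, b, F0=lambda u: u, g=lambda u: np.ones_like(u))
int_K2 = 350.0 / 429.0                    # integral of K^2 for the triweight kernel
limit = t * (1.0 - t) * int_K2            # F0(t){1 - F0(t)} int K^2 / g(t) with g = 1
print(b * value, limit)
```

For small $b$ the two printed numbers should nearly coincide, which is the content of the limit relation for the current status model.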

Remark 6.1. It is tempting to think that the asymptotic variance can be found
for case 2 by computing

$$\lim_{b\downarrow 0} b\,E\,\theta_{t,b,F_0}(T_1,U_1,\Delta_{11},\Delta_{12})^2,$$

just as in the current status model. However, numerical computations suggested
that $b\,E\,\theta^2_{t,b,F_0}$ tends to zero in the non-separated case. This might mean that
the variance is not of order $n^{-4/5}$ in this case, but perhaps contains a
logarithmic factor, in analogy with the variance $(n\log n)^{-2/3}$ for the histogram-type
estimators, like Birge's estimator and the MLE without smoothing.

However, we do not expect this to happen for the separated case. All this still
has to be determined by the analysis of the difference in asymptotic behavior
of the functions $\phi_{t,b_n,F_0}$ for the separated and non-separated case (see Figure 4
for a picture of the rather different behavior of $\phi_{t,b_n,F_0}$ in these two situations).

7. Simulation results for the non-separated case 

In tables 2 to 5 we present some simulation results for the "non-separated case"
for Birge's estimator, the MLE and the smoothed MLE. In all cases the
observation density was the uniform density on the upper triangle. All results
are based on 10,000 pseudo-random samples. For Birge's estimator the
asymptotically optimal binwidth was chosen in all simulations.

We study the case where $f_0$ is the uniform density on [0, 1] and give results
for the interior points $t_0 = 0.3$, 0.4, 0.5 and 0.6. Although these points are
somewhat arbitrarily chosen, the results are representative for what happens in
the interior of the interval.
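The data-generating mechanism of this simulation setting can be sketched as follows. The sketch draws $X_i$ from the Uniform(0, 1) distribution, draws $(T_i, U_i)$ uniformly on the upper triangle by sorting two independent uniforms, and records only the indicators. The function name and the use of NumPy's random generator are illustrative choices, not taken from the paper.

```python
import numpy as np

def sample_case2_nonseparated(n, rng):
    """Simulate interval censoring, case 2, with uniform F_0 and uniform (T, U) on {t < u}."""
    x = rng.uniform(size=n)                          # unobservable X_i
    tu = np.sort(rng.uniform(size=(n, 2)), axis=1)   # sorted pair is uniform on the upper triangle
    t, u = tu[:, 0], tu[:, 1]
    d1 = (x <= t).astype(int)                        # X_i to the left of T_i
    d2 = ((x > t) & (x <= u)).astype(int)            # X_i between T_i and U_i
    d3 = 1 - d1 - d2                                 # X_i to the right of U_i
    return t, u, d1, d2, d3

rng = np.random.default_rng(2011)
t, u, d1, d2, d3 = sample_case2_nonseparated(1000, rng)
print(d1.mean(), d2.mean(), d3.mean())
```

Under these assumptions each of the three indicators has expectation 1/3, which gives a quick sanity check on a simulated sample.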

It can be seen from the tables that the squared bias for the MLE is, in all
cases, negligible compared to the variance. We note that this is in contrast with
Birge's estimator. Moreover, the variance of the MLE is generally smaller than
that of Birge's estimator. Table 5 shows, not unexpectedly, that the MSE of
the smoothed MLE is much smaller than the MSE of either the MLE or Birge's
estimator.
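The quantities reported in tables 2 to 4 are linked by the identity MSE = variance + squared bias; for instance, for Birge's estimator at $t_0 = 0.3$ and $n = 1000$ the tables give 0.79 + 0.31 = 1.10. A minimal sketch of how such scaled summaries could be computed from Monte Carlo replicates is given below; the function name is hypothetical and the replicates used in the example are synthetic, not the paper's.

```python
import numpy as np

def scaled_mse_decomposition(estimates, truth, n):
    """Return (MSE, variance, squared bias), each multiplied by (n log n)^(2/3)."""
    est = np.asarray(estimates, dtype=float)
    scale = (n * np.log(n)) ** (2.0 / 3.0)
    bias2 = (est.mean() - truth) ** 2
    var = est.var()                        # ddof = 0, so mse = var + bias2 holds exactly
    mse = np.mean((est - truth) ** 2)
    return scale * mse, scale * var, scale * bias2

# Synthetic replicates standing in for 10,000 simulated estimates of F_0(0.3) = 0.3.
rng = np.random.default_rng(0)
replicates = 0.3 + 0.002 + 0.02 * rng.standard_normal(10_000)
print(scaled_mse_decomposition(replicates, truth=0.3, n=1000))
```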

8. Simulation results for the separated case 

For the separated case the results of a simulation study are provided in the tables
6 to 14. We first take again $F_0$ to be the Uniform(0, 1) distribution function. On
the other hand, we chose the observation density defined by (4.8), with $\epsilon = 0.1$,
so the observation times $T_i$ and $U_i$ cannot become arbitrarily close. The results
are based on 10,000 pseudo-random samples. As in the non-separated case, the
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Table 2 

MSE for Birge's estimator and MLE, times $(n\log n)^{2/3}$, $t_0$ = 0.3, 0.4, 0.5 and 0.6,
non-separated case. The asymptotic MSE of Birge's estimator and the conjectured MSE of
the MLE are displayed in the row labelled "asymptotic".

               t0 = 0.3        t0 = 0.4        t0 = 0.5        t0 = 0.6
               Birge   MLE     Birge   MLE     Birge   MLE     Birge   MLE
  asymptotic   1.01    0.55    0.99    0.55    0.98    0.55    0.99    0.55
  n = 1000     1.10    0.50    1.10    0.55    1.09    0.55    1.11    0.55
  n = 2500     1.06    0.52    1.08    0.54    1.07    0.55    1.06    0.53
  n = 5000     1.05    0.50    1.03    0.54    1.04    0.56    1.03    0.53
  n = 10000    1.03    0.51    1.02    0.54    1.00    0.54    1.06    0.54



Table 3

Variance for Birge's estimator and MLE, times $(n\log n)^{2/3}$, $t_0$ = 0.3, 0.4, 0.5 and 0.6,
non-separated case. The asymptotic variance of Birge's estimator and the conjectured
asymptotic variance of the MLE are displayed in the row labelled "asymptotic".

               t0 = 0.3        t0 = 0.4        t0 = 0.5        t0 = 0.6
               Birge   MLE     Birge   MLE     Birge   MLE     Birge   MLE
  asymptotic   0.67    0.55    0.66    0.55    0.66    0.55    0.66    0.55
  n = 1000     0.79    0.50    0.78    0.55    0.78    0.55    0.79    0.55
  n = 2500     0.75    0.52    0.75    0.54    0.74    0.55    0.73    0.53
  n = 5000     0.74    0.50    0.71    0.54    0.73    0.56    0.72    0.53
  n = 10000    0.69    0.51    0.69    0.54    0.69    0.54    0.72    0.54



Table 4

Squared Bias for Birge's estimator and MLE, times $(n\log n)^{2/3}$, $t_0$ = 0.3, 0.4, 0.5 and 0.6,
non-separated case. The asymptotic squared bias of Birge's estimator is displayed in the
row labelled "asymptotic".

               t0 = 0.3            t0 = 0.4            t0 = 0.5            t0 = 0.6
               Birge   MLE         Birge   MLE         Birge   MLE         Birge   MLE
  asymptotic   0.34                0.33                0.33                0.33
  n = 1000     0.31    3.4·10^-4   0.32    1.6·10^-4   0.31    2.4·10^-5   0.32    1.3·10^-4
  n = 2500     0.31    1.3·10^-4   0.32    8.4·10^-6   0.33    7.9·10^-6   0.33    5.6·10^-7
  n = 5000     0.30    5.5·10^-7   0.32    1.6·10^-4   0.31    2.5·10^-4   0.31    3.6·10^-4
  n = 10000    0.34    6.3·10^-5   0.33    4.1·10^-6   0.31    4.1·10^-6   0.34    8.2·10^-5


Table 5

MSE of SMLE divided by MSE of MLE, $t_0$ = 0.3, 0.4, 0.5 and 0.6, non-separated case.

               t0 = 0.3   t0 = 0.4   t0 = 0.5   t0 = 0.6
               ratio      ratio      ratio      ratio
  n = 1000     0.247      0.262      0.265      0.263
  n = 2500     0.217      0.236      0.236      0.233
  n = 5000     0.203      0.219      0.224      0.216
  n = 10000    0.187      0.197      0.204      0.201
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Table 6 

MSE for Birge's estimator divided by its asymptotic value, $t_0$ = 0.3, separated case.



n = 10^ 


Mt) = 1 


/oW = 4{l-t)3 


1.12 


1.09 


n = 10"^ 


1.04 


1.04 



Table 7

MSE for Birge's estimator and MLE, times $n^{2/3}$, $t_0$ = 0.3, 0.4, 0.5 and 0.6, separated case.
The asymptotic MSE (Birge) and "the asymptotic variance" (MLE) are displayed in the row
labelled "asymptotic".

               t0 = 0.3        t0 = 0.4        t0 = 0.5        t0 = 0.6
               Birge   MLE     Birge   MLE     Birge   MLE     Birge   MLE
  asymptotic   0.34    0.12    0.32    0.13    0.31    0.13    0.32    0.13
  n = 1000     0.58    0.14    0.57    0.15    0.56    0.15    0.57    0.15
  n = 2500     0.44    0.13    0.46    0.14    0.49    0.14    0.48    0.14
  n = 5000     0.52    0.13    0.48    0.13    0.50    0.14    0.50    0.13
  n = 10000    0.46    0.12    0.48    0.13    0.49    0.14    0.49    0.14



MSE of the MLE turns out to be smaller than the MSE of Birge's estimator. 
Here the difference is however even more noticeable. 
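The separated observation scheme can be simulated with a small change to the non-separated one. The sketch below assumes that the observation density of (4.8) is the uniform density on the triangle with vertices $(0, \epsilon)$, $(0, 1)$ and $(1 - \epsilon, 1)$ (definition (4.8) lies outside this section, so this is an assumption): two uniforms on $[0, 1-\epsilon]$ are sorted and the larger one is shifted by $\epsilon$. The function name is hypothetical.

```python
import numpy as np

def sample_observation_times_separated(n, eps, rng):
    """Draw (T, U) uniformly on {t >= 0, u <= 1, u - t >= eps}."""
    ab = np.sort(rng.uniform(0.0, 1.0 - eps, size=(n, 2)), axis=1)
    t = ab[:, 0]                # smaller of the two uniforms
    u = ab[:, 1] + eps          # larger one, shifted so that U - T >= eps
    return t, u

rng = np.random.default_rng(8)
t, u = sample_observation_times_separated(1000, eps=0.1, rng=rng)
print((u - t).min())            # never below eps
```

The shift is a measure-preserving map from the upper triangle of $[0, 1-\epsilon]^2$ onto the target triangle, so the resulting pair has constant density $2/(1-\epsilon)^2$ there.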

In the tables 7 to 9 we give the results for the MSE, variance and squared bias 
for both estimators. Again it can be seen that the variance of Birge's estimator is 
generally larger than the variance of the MLE. Moreover, as in the non-separated 
case, the squared bias for the MLE is, in all cases, negligible compared to the 
variance. 

To show that the results are not specific for the uniform distribution, we
give in the tables 11 to 13 the corresponding comparisons for the distribution
function $F_0$, with density $f_0$, defined by

$$F_0(x) = 1 - (1-x)^4, \qquad f_0(x) = 4(1-x)^3, \qquad x \in [0,1].$$

For the computation of the asymptotic variance of the MLE we used (5.1) of 
section 5. It is seen that the correspondence between the asymptotic expression 
for the variance and the actual sample variance of the MLE is rather good, 
and also that the superiority of the MLE w.r.t. Birge's estimator is still more 
pronounced for this distribution function. Table 14 shows that the ratio of the 
MSE of the SMLE and the MSE of the actual MLE is somewhat larger here, 
which is probably due to the fact that the asymptotic bias plays a larger role for 
the SMLE in this case (this bias vanishes for the uniform distribution function). 
The bias of the actual MLE is again very small for this distribution function, 
however. 

As the fit with the asymptotic MSE was not satisfactory for Birge's estimator 
in the separated case, we also did some simulations for much larger sample 
sizes. It turns out that the MSE then approximates the values predicted by the 
asymptotic theory. Some evidence is given in table 6. The results are based on 
1000 pseudo-random samples. 
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Table 8 

Variance for Birge's estimator and MLE, times $n^{2/3}$, $t_0$ = 0.3, 0.4, 0.5 and 0.6, separated
case. The asymptotic variances are displayed in the row labelled "asymptotic".

               t0 = 0.3        t0 = 0.4        t0 = 0.5        t0 = 0.6
               Birge   MLE     Birge   MLE     Birge   MLE     Birge   MLE
  asymptotic   0.23    0.12    0.21    0.13    0.20    0.13    0.21    0.13
  n = 1000     0.46    0.14    0.47    0.15    0.46    0.15    0.47    0.15
  n = 2500     0.35    0.13    0.36    0.14    0.39    0.14    0.37    0.14
  n = 5000     0.42    0.13    0.38    0.13    0.40    0.14    0.39    0.13
  n = 10000    0.36    0.12    0.37    0.14    0.39    0.14    0.39    0.14


Table 9 

Squared Bias for Birge's estimator and MLE, times n^l^ , to = 0.3, 0.4, 0.5 and 0.6, 
separated case. The asymptotic squared bias (Birge) is displayed in bold type. 





to 


= 0.3 


to 


= 0.4 


to 


= 0.5 




to 


= 0.6 




Birgc 


MLE 


Birgc 


MLE 


Birgc 


MLE 


Birgc 


MLE 




0.11 




0.11 




0.10 




0.11 




n = 1000 


0.11 


2.3 ■ 10"^ 


0.10 


1.1 • 10-6 


0.10 


1.3 • 10" 


b 


0.10 


1.2 • 10-'' 


n = 2500 


0.09 


5.1 • 10-6 


0.10 


1.7- 10-5 


0.09 


3.1 • 10- 


6 


0.12 


2.0 • 10-" 


n = 5000 


0.11 


4.0- 10-8 


0.09 


2.6 • 10-6 


0.10 


5.9- 10" 


-5 


0.11 


1.6 • 10-6 


n = 10000 


0.10 


3.2 • 10-5 


0.11 


2.1 • 10-6 


0.10 


1.0- 10" 


-6 


0.10 


4.6- 10-6 



Table 10

MSE of SMLE divided by MSE of MLE, $t_0$ = 0.3, 0.4, 0.5 and 0.6, separated case.

               t0 = 0.3   t0 = 0.4   t0 = 0.5   t0 = 0.6
               ratio      ratio      ratio      ratio
  n = 1000     0.258      0.272      0.274      0.268
  n = 2500     0.230      0.244      0.243      0.244
  n = 5000     0.219      0.225      0.225      0.219
  n = 10000    0.199      0.201      0.206      0.203



Table 11

MSE for Birge's estimator and MLE, times $n^{2/3}$, $f_0(t) = 4(1-t)^3$, $t \in [0, 1]$,
$t_0$ = 0.3, 0.4, 0.5 and 0.6, separated case. The asymptotic MSE (Birge) and the asymptotic
variance (MLE) are displayed in the row labelled "asymptotic".

               t0 = 0.3        t0 = 0.4        t0 = 0.5        t0 = 0.6
               Birge   MLE     Birge   MLE     Birge   MLE     Birge   MLE
  asymptotic   0.41    0.15    0.24    0.081   0.14    0.037   0.08    0.014
  n = 1000     0.53    0.16    0.39    0.088   0.21    0.041   0.101   0.016
  n = 2500     0.61    0.16    0.33    0.087   0.25    0.039   0.100   0.015
  n = 5000     0.56    0.16    0.36    0.083   0.18    0.038   0.101   0.014
  n = 10000    0.49    0.15    0.36    0.082   0.22    0.037   0.120   0.014
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Table 12 

Variance for Birge's estimator and MLE, times $n^{2/3}$, $t_0$ = 0.3, 0.4, 0.5 and 0.6,
$f_0(t) = 4(1-t)^3$, $t \in [0, 1]$, separated case. The asymptotic variances are displayed in
the row labelled "asymptotic".

               t0 = 0.3        t0 = 0.4        t0 = 0.5        t0 = 0.6
               Birge   MLE     Birge   MLE     Birge   MLE     Birge   MLE
  asymptotic   0.28    0.15    0.16    0.081   0.091   0.037   0.051   0.014
  n = 1000     0.41    0.16    0.32    0.087   0.16    0.040   0.070   0.016
  n = 2500     0.48    0.16    0.25    0.087   0.19    0.039   0.063   0.015
  n = 5000     0.43    0.16    0.28    0.082   0.13    0.038   0.070   0.014
  n = 10000    0.35    0.15    0.28    0.082   0.17    0.037   0.090   0.014



Table 13 

Squared Bias for Birge's estimator and MLE, times n^/^, fo{t) = 4(1 — t)^,f e [0, 1], 
to = 0.3, 0.4, 0.5 and 0.6, separated case. The asymptotic squared bias (Birge) is displayed 

in bold type. 





to 


= 0.3 




to 


= 0.4 




to 


= 0.5 




to 


= 0.6 




Birge 


MLE 


Birge 


MLE 


Birgc 


MLE 


Birge 


MLE 


0.14 




0.079 




0.045 




0.025 




n = 


1000 


0.12 


1.1 • 10" 


4 


0.076 


1.6 • 10" 


-4 


0.051 


2.2 ■ 10" 


4 


0.030 


3.2 • 10" 


4 


n = 


2500 


0.13 


3.2 ■ 10" 


5 


0.080 


2.3 ■ 10" 


-4 


0.054 


2.0 ■ 10" 


4 


0.037 


1.1 ■ 10" 


4 


n = 


5000 


0.13 


1.4- 10" 


6 


0.075 


3.0 • 10- 


-4 


0.048 


1.0 ■ 10- 


4 


0.031 


8.7- 10- 




n = 


10000 


0.13 


4.8 • 10" 


5 


0.079 


1.4 • 10" 


-4 


0.049 


1.1 • 10" 


4 


0.030 


8.0- 10" 


5 



Table 14

MSE of SMLE divided by MSE of MLE, $f_0(t) = 4(1-t)^3$, $t \in [0, 1]$,
$t_0$ = 0.3, 0.4, 0.5 and 0.6, separated case.

               t0 = 0.3   t0 = 0.4   t0 = 0.5   t0 = 0.6
               ratio      ratio      ratio      ratio
  n = 1000     0.439      0.395      0.443      0.435
  n = 2500     0.372      0.393      0.409      0.424
  n = 5000     0.350      0.354      0.383      0.391
  n = 10000    0.312      0.332      0.349      0.389
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9. Summary 

In the preceding, the limit distributions of three estimators for the interval
censoring, case 2, problem were discussed: Birge's estimator, the (nonparametric)
maximum likelihood estimator (MLE) and the smoothed MLE, which is analo- 
gous to the smoothed MLE introduced in [8] for the current status model. Birge's 
estimator is mainly of theoretical interest and constructed to show that the min- 
imax rate can be attained. The construction uses prior knowledge on whether 
the observation distribution has arbitrarily small observation intervals (the so- 
called non-separated case) or not (the separated case). Such prior knowledge is 
not necessary for the MLE, which adapts automatically to either situation. 

The conjectured limit distribution of the MLE in the non-separated case, 
given in [5], was (partially) checked in a simulation study, comparing Birge's 
estimator, the MLE and the smoothed MLE. The simulation study seems to 
support the conjecture. The smoothed MLE converges at a faster rate than 
either Birge's estimator or the MLE on which it is based if the underlying 
distribution is smooth, as is also borne out by the simulation study. 

The limit distribution of the MLE in the separated case was given in [6] and 
the simulation study for the separated case shows that the asymptotic variance, 
arising from this result, provides a good approximation to the actual finite 
sample variance. The difference in behavior for the separated and non-separated 
cases persists for the smoothed MLE and in that case crucially depends on 
properties of the solution of an integral equation, as discussed in section 6. This 
analysis is based on a local version of the theory developed in [2] , [3] and [4] . The 
(numerical) solution of the integral equation can be used to estimate the variance 
of the smoothed MLE. The theoretically computed asymptotic variance, using 
a numerical solution of the integral equation, fits the observed sample variance 
rather well, but the discussion on this matter is heuristic and still contains lots 
of open questions. 



10. Appendix 

We split the proof of Theorem 3.1 into several parts, dealing with the difficulties
(1), (2) and (3), mentioned in section 3. Here and in the following we will
use some empirical process notation to make the transition to the asymptotic
distribution more transparent. As an example, we give a representation of

$$\frac{N_k'}{N_k} \tag{10.1}$$

in terms of integrals with respect to empirical distributions. First we write:

$$n^{-1}N_k' = \int_{t\in I_k} \delta_1\,d\mathbb P_n(t,u,\delta),$$

where $\delta = (\delta_1, \delta_2, \delta_3)$ is the vector of indicators

$$\delta_1 = 1_{\{x \le t\}}, \qquad \delta_2 = 1_{\{t < x \le u\}}, \qquad \delta_3 = 1_{\{x > u\}},$$

giving the position of the unobservable random variables $X_i$ with respect to
the observation interval $[T_i, U_i]$, and where $\mathbb P_n$ is the empirical measure of the
random variables $(T_i, U_i, \Delta_i) = (T_i, U_i, \Delta_{i,1}, \Delta_{i,2}, \Delta_{i,3})$.

The denominator of (10.1), after dividing by $n$, is rewritten in the form:

$$n^{-1}N_k = \int_{t\in I_k} dG_{n,1}(t) = G_{n,1}(t_{k+1}) - G_{n,1}(t_k), \tag{10.2}$$

where $G_{n,1}$ is the empirical distribution function of the $T_i$, with underlying df $G_1$
and underlying density $g_1$, which is the first marginal of $h$. Using this notation,
we get:

$$\frac{N_k'}{N_k} = \frac{\int_{t\in I_k} \delta_1\,d\mathbb P_n(t,u,\delta)}{G_{n,1}(t_{k+1}) - G_{n,1}(t_k)}, \tag{10.3}$$

where we define the ratio to be zero if the denominator is zero. The terms
$M_k'/M_k$ and $Q_{j,k}'/Q_{j,k}$ can be rewritten in a similar way.
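The bin counts behind this representation can be sketched in a few lines. The sketch assumes that $N_k$ counts the $T_i$ in the bin $I_k$ and that $N_k'$ counts those among them with $\Delta_{i,1} = 1$, and it applies the convention that the ratio is zero when the denominator is zero. The function name and the equally spaced bin edges in the example are hypothetical.

```python
import numpy as np

def bin_ratios(t, d1, edges):
    """Return (N_k, N'_k, N'_k / N_k) per bin, with the ratio set to 0 where N_k = 0."""
    t = np.asarray(t, dtype=float)
    d1 = np.asarray(d1, dtype=float)
    n_bins = len(edges) - 1
    # Bin index of each T_i; the last bin is closed on the right.
    idx = np.clip(np.searchsorted(edges, t, side="right") - 1, 0, n_bins - 1)
    N = np.bincount(idx, minlength=n_bins)
    N_prime = np.bincount(idx, weights=d1, minlength=n_bins)
    ratio = np.divide(N_prime, N, out=np.zeros(n_bins), where=N > 0)
    return N, N_prime, ratio

N, N_prime, ratio = bin_ratios([0.05, 0.15, 0.15, 0.95], [1, 0, 1, 1], np.linspace(0.0, 1.0, 5))
print(N, N_prime, ratio)
```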
We will also use the following decomposition: 

- F„{tk)^ l{JV,>o} 

= -j^^ l{iVfc>o} + l{Wfc>o}- (10.4) 

We similarly have, denoting $1 - F_0$ by $\bar F_0$,



- -Fo(40| l{J\/fe>0} 



Ml~E{Ml\Mk} E{M',~AhFMm} 



and 



^-{Fo(ifc)-Fo(t,)}|l{Q^,>o} 
^ ^1/ 



g., ^{Qi.fc>0} 

E ' 



{Q'j,k - Qj,k {Foitk) - Fo{t,)} \Q,,k} ^ 



n -{Q.,.>o} ■ (10.6) 

One can consider this as a decomposition into a "variance part" and a "bias 
part" , where the first terms on the right-hand sides of the above expressions 
correspond to the variance part and the second terms to the bias part. 
We first deal with the bias part. 

Lemma 10.1. Let the conditions of Theorem 3.1 be satisfied, and let, for each
interval $I_k$ of the partition, $t_k = t_k^{(n)}$ be its left boundary point. Moreover, let
$t_j = t_j^{(n)} \to t_0$, and let $a_n$ be defined by (3.12). Then we have for Birge's statistic,
defined by (3.4),
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(i) As n ^ GO, 



a„ ^ var 



(^E ^^FnitJ) - Foitj) I 7Vfe,Qj,fe, fc> j; M^, Qfe,,, fc < j})' ^ 0. 

(10.7) 



(ii) As n oo, 

a-^E^^K{tj)~ Foitj) I Nk,Qj^k, k> j; A4,Qfej, fc<j} ^ |c/o(io)- 

(10.8) 

Proof. 

(i). If Nk, Affe, Qj.fc and Qkj are strictly positive, for all (relevant) values of k, 
we can write 

E {F„(ij) - Fo(ij) I Nk,Qj^k, k > j; Mk, Qk.j, k < j} 
E{N'^^NkFoitk)\Nk} 



J2 ^J-'^' 

k:k>j 



+ J2 "^3^^ \ 
k:k<j I 



E 



{Q',,k~Q,.k{Fo{tk)-F,itj)}\Q,,k} 



E{Mk-MlJo{tk)\Mk] 



'{Q'k.j-QkAMtk)-Fo{tj)}\Qu,j} 



E- 

Qik,j 

see (10.4) to (10.6). We can write this in the following form: 

E {Fn{tj) - Foitj) I Nk, QjM, k > j; Mk.Qkj, k < jj 



J2 ^i^k 

k:k>j 



J2 ^j.k \ 

k:k<j I 



Sn,l(tfe+l) — Gn,l(tfc) 

Ii, {Foit) - Fo{tk)} dGnAt) 

^naitk+l) — Gn,2(^fe) 
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By expanding in t^. and tj, as in (3.11), we find that this can be written 
E {F„(tj) - i^ofe) I Nk, Qj^k, k > j; Mk,Qk^j, k < j} 



k:k>j 



k:k<j 



foitk) Jj^ it-tk) dGn,iit) 

— Gn^litk) 

Itei„uei, i/ofa) - ^fc) - /ofe) - ^j)} '^^r^it, u) • 
Itei„uei, dMnit,u) 
foitk) 4 it-tk) dGnAt) 



+ o(l/X). 

The remainder term o{\/ K) arises from the fact that we can write, for example, 
//, {Foit) - Fojtk)} dGnAt) _ foitk) 4 it - tk) dGnAt) ^ ^ - t) o(l) 

Gnsitk+l) — Gn.litk) Gn^iitk+l) — G„.i(ife) 

where tk+i ~ tk < ^/K, and the o(l)-factor is uniform in fc, by the uniform 
contiuity of /o on [0, 1]. A similar expansion is used for the other terms, and the 
oil/K) remainder term now surfaces from the fact that the weights Wj^k sum 
to 1. 

Furthermore, if j < k, and tk,tj e [e, 1 — e], for some e G (0, 1/2), we get: 



■'''^1 Gnal^fe+i) - Gn,i(^fe) 'G„,i(tfc+i) — G„j(ife) 1 {'^kyoyo} 

nfoitk)^ {jj^ it - tk) d (G„,i - Gi) (i)}' 
- ^ (1 + fc - j)^Wf {GnAtk+i) - Gn,iitk)} ^^""^'^ 

foitk)' ^ 1 

^; 

3 



foitk) L it - tk) dGnsit) foitk) Jr it ~ tk) dGiit) 



2 



3/o(ife)' 



n(l + fc - j)^giitk) {a(to) + 5(io)}' (logn)2 

where we use (3.10) and exponential inequalities of the type discussed in the 
proof of Lemma 3.1 below for the probability that 

\Gn,iitk+i) ' Gn,iitk) - Giitk+i) + Giitk)\ > e. 
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We similarly get, for all k > j, 

^'"3,k { 



{/ofa)^ + /ofe)^} {1 + 0(1)} 
3n(l + k- j)^h{tj,tk) {a{to) + b{to)}^ (logn)2 



kQi,k>o} 



with an analogous upper bound for the terms, involving Qk,j,, with k < j, and, 
finally, if fc < j, and tk, tj G [e, 1 — e]. 



Ew 



Mtk) 4 {t-tk) d{Gn,2-G2){t)\ 



2 



< 



3/ofa)^{l + o(l)} 



n{l + k- jfg2{tk) {a{to) + 6(to)} (logn 
The terms for tfc > 1 — e are treated by using 

^wlkUtk)\ Mtkfkitk? 



K^^k - j + 1)2 {a{to) + b{to)f (logn)2 

~ K^itk - tjY {a(to) + 6(io)}' (logn)2 ' 
with a similar upper bound for tk < e and 

I G„,2(ifc+i) - G„,2(tfe) J 
We also have, for example, if fc' > fc > j, 

{Wi,kfo{tk) /r {t - tk) (i(G„.i - Gi) {t) 
k 
'Gn,l{tk+l) ~ 'Gn.litk) 

'Wj,k'fo{tk') Jj^, [t - tk') d{Gn,i - Gi) (t) I 

Gn,l{tk' + l) — 'Gn,l{tk') j 

9a{tk)a{tk') 

~ AnK^aito) + b{to)V{k -j + l){k' - j + l)(logn)2 ' 

and the expectation of other cross-product terms can be treated similarly. 
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Combining these results, we find that the variance of the conditional expec- 
tation 

a-^E {Pnitj) ~ Fo{tj) I Nk,Qj,k, k > j; Mfc, Q^j, k < j| 

is of order 0(l/{n(logn)^a^}) — o(l). 
(ii). We have, if tk G [e, 1 — e], 

4 {Fojt) - Fojtk)} rfGn,i(t) ^ ^cVofa) ~ tfc)^ ffifa) {1 + Op(l)} 

Gn,l(ife+l) - Gn,l(tfc) Cgi{tk) {tk+1 - tk) {1 + Op(l)} 

= Icfoitk) (tk+1 - tk) {I + Op{l)} , 

and similarly, 

^^T^^ — 77 ^ — ^ — zr\ = ^cfoitk) [tk+i - tk) {1 + Op(l)} , 

'^n,2[tk+l) — 'iin,2(,Ifej 

Moreover, if fc > j, 

^ l^ {/ofa) fa+l - tk) - /oftj) (tj + l - tj)} h{tj,tk) {1 + Op(l)} 
/l(i,,ifc){l + Op(l)} 

= 5c{/o(tfe) {tk+1 - tk) - hit,) {t,+i - tj)} {1 + Op(l)} , 

with a similar expansion for k < j. The Op (l)-terms are uniform in /c, as follows 
by using exponential inequalities of the same type as used in Lemma 3.1. 

It is easily seen that the terms, involving values of tk ^ [^i 1 ^ f] give a 
negligible contribution, by noting that 

foitk)JjJt~tk)dG^,i{t) ^ ^^^^^^ 

— 7^ 77 7 ^ Tr\ - JO[tk) (tk+1 - tk) , 

''^n,l[tk+l) — ''^n,l\tk) 

if $k > j$, with a similar upper bound if $t_k < t_j$. The result now follows by
multiplying with $w_{j,k}$ and summing over $k$, see (3.11). □



We now define

$$U_{n,k} = n^{-1}\{N_k' - E\{N_k' \mid N_k\}\}, \tag{10.9}$$

and

$$V_{n,k} = n^{-1}\{M_k' - E\{M_k' \mid M_k\}\}. \tag{10.10}$$

Note that these are the numerators of the "variance parts" in (10.4) and (10.5),
divided by $n$. The following lemma shows that (in the proper scaling for Birge's
statistic) the variances of the sums of terms, involving $U_{n,k}$ and $V_{n,k}$ in Birge's
statistic, tend to zero.
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Lemma 10.2. Let the conditions of Theorem 3.1 he satisfied, let tj ~ tg, and 
let $a_n$ be defined by (3.12). Moreover, let $U_{n,k}$ and $V_{n,k}$ be defined by (10.9) and
(10.10). Then, as $n \to \infty$,



a„ ^var 



k^j '^"'-^ (^fe+l) ^ '^"4 (^fc) '^ri,2 (ife+l) ^ '^n,2 {Ik) 



Proof. We have: 

Wj^kUnM \ - Wj,kVn^k 



var 



EuJj,k'-Jn,k \ " 



El Wj.kUn.k \ , \ ^ f Wj,kVn.k 

var — — z — r + > var 



k-k>j ^ (^'=+1-' ^ "^"^1 ^ fc'^j \'^n,2 (i/c+l) ^ G'„^2 (^fe) 

since the covariances of the terms in the sum are zero. As before, we define the 
ratios to be zero if the denominator is zero. 
Furthermore: 

Wj^kUn,k ^ — E ' U)j^kUn,k 



n,l (Ik + l) — (tk) J \Gn,l (Ik + l) — Grn,l {tk) 

since E Wj^kUn,k/{Grn,i (ifc+i) — Gn,i (tk)} = 0. Noting that the weights Wj^k 
have upper bound 

^/n{Gn,i (tk+i) - Gna (ifc)} 
{k-j + l)W, ' 

we now obtain: 

_2 I Wj,kUn.k 



< a„ "^E 



= a-'^E 



ri,l (^fc+l) ^ 'Gn,! (tk) 



{k-j + 1)2 {G„,i (ifc+i) - G„,i (tk)} W] 

^2 



a. 



" (fc ~ j + 1)2 {G„a (tfc+i) - G„,i (tfe)} 
-Hmk){l-F^{tk))+o{l)} 



where (as before), 

Un,k/ {GnA (tk+l) — G„^i {tk)} 0, 

ifG„,i(tfc+i)-G„,i(tfc) = 0. 
By (3.10): 

E{i/wf} i{^^>o} - . ,^,;^n2n ^ n-^/^(logn)-^/^ (10.11) 

n{a(io) + o(to)} (logn)^ 
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So we obtain 



-2 / Wj]JJn,k 

> var ' 



E 



+ E 

k:k<ij 

0(1/ logn) . 



n,l (i/c+l) ~ "En,! (^fe). 



*G„^2 (^fe+l) — Gn,2 {tk) 



□ 



We now define, if $j < k$,

$$W_{n,j,k} = (\log n)\{Q_{j,k}' - E\{Q_{j,k}' \mid Q_{j,k}\}\}, \tag{10.12}$$

and, if $j > k$:

$$W_{n,j,k} = (\log n)\{Q_{k,j}' - E\{Q_{k,j}' \mid Q_{k,j}\}\}. \tag{10.13}$$

Lemma 10.2 suggests that if (nlogn)^/'^{i^„(io) — ^o(io)} has a nondegenerate 
distribution, this has to come from the sum: 

The following lemma shows that (10.14), with the random weights Wj,k replaced 
by the deterministic weights Wj^k indeed has a nondegenerate limit distribution. 

Lemma 10.3. Let the conditions of Theorem 3.1 he satisfied, let tj — to. More- 
over, letWn,j,k be defined by (10.12) and (10.13). Then: 

where the right-hand side denotes a normal random variable, with expectation 
and variance ctq, defined by (3.14) in Theorem 3.1. 

Proof. We will prove the result by constructing a martingale-difference array, 
and applying Theorem 1, p. 171 of [12]. Define, for k > j, the random variables 

f _ _ ~ Wn,j,k 
C,n.k — 



'^C^hitj,tk)' 

For k < j we define 

^"''""'^■''c2/i(i,,i^.)' 

and (for notational convenience) we define = 0. 

Let the increasing sequence of cr-fields Tn.k, fc = 0, 1, . . . be defined by 

J^nfi = 0, J^nM = o- {(7i, Ui,Ai) , Ti < tfe+i, e Ij} , k <j, 
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and 

J^n,k = <y {{Ti, Ui, Ai) , Ti e Ij, Ui < tk+i} ,j<k, 

where Aj = (A^^, Ai,2, Aj^a), as before. Note: Ik = [tk,tk+i), k < K, and 
Ik = [tK,tK+i], under scheme (i), and 4 = [^^,^^+1), k < K, and Ik+i = 
[tx+i , iK+2\ under scheme (ii) at the beginning of this section. 
Then: 

E {U,k I J'n,k-i] = 0, fc = 1, 2, . . . (10.15) 

Here and in the following the indices $k$ run from 1 to $K$ or to $K + 1$, depending
on whether scheme (i) or (ii) holds, respectively.
Note that, if fc < j, we can write 

Wn,j,k = nlogn / {^2 - {Fq{u) - Fo(i)}} dP„(t,«,5) 

J(t,M)6/fcX/j 

n 

= logn^ {A2,i - {Fo{u) - Foit)}} 1{t,£7„ c/,e/,} • 

i=l 

and that 

E{A2,, - {Fo(f/,) - Fo(7i)} I ^n,k) = 0, 

if tk < Ti < tj and Ui € Ij , using the independence of the Xi from the pairs 
{Ti, Ui). Similar relations hold if ti G Ij. This implies 

E{(n^k I Tn,k-l} =0,k^l,2,... . (10.16) 

It is also clear that S,n,k is measurable with respect to J^n,k- 
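Let the conditional variances $V_{n,k}$ be defined by

$$V_{n,k} = E\{\xi_{n,k}^2 \mid \mathcal F_{n,k-1}\}, \quad k = 1, 2, \dots.$$

We first consider the indices k such that

$$|j - k| < \varepsilon_n K,$$

where $\varepsilon_n = (\log n)^{-1/3}$. We then get, if k < j,

$$\begin{aligned}
V_{n,k} &= \frac{\tilde w_{j,k}^2\, n(\log n)^2}{c^4 h(t_k,t_0)^2}\, E\Bigl\{\int_{t\in I_k,\,u\in I_j} \{F_0(u)-F_0(t)\}\{1-F_0(u)+F_0(t)\}\, d\mathbb H_n(t,u) \Bigm| \mathcal F_{n,k-1}\Bigr\} \\
&\sim \frac{\tilde w_{j,k}^2\, n^{1/3}(\log n)^{4/3}(t_j-t_k) f_0(t_0)\{1+o_p(1)\}}{c^2 h(t_k,t_0)} \\
&\sim \frac{9b(t_0)^2 (n\log n)^{1/3}(j-k+1) f_0(t_0)\{1+o_p(1)\}}{c^2 K\{a(t_0)+b(t_0)\}^2 h(t_k,t_0)(j-k+1)^2\log n} \\
&\sim \frac{9b(t_0)^2 f_0(t_0)\{1+o_p(1)\}}{c\{a(t_0)+b(t_0)\}^2 h(t_k,t_0)(j-k+1)\log n}.
\end{aligned}$$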
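We similarly get:

$$V_{n,k} \sim \frac{9a(t_0)^2 f_0(t_0)\{1+o_p(1)\}}{c\{a(t_0)+b(t_0)\}^2 h(t_0,t_k)(k-j+1)\log n},$$

if k > j and $k - j < \varepsilon_n K$. The terms where $|k - j| > \varepsilon_n K$ give a negligible
contribution, since

$$\sum_{k:\,|j-k|>\varepsilon_n K} V_{n,k} = O_p\bigl((\log n)^{-2/3}\bigr),$$

using

$$\sum_{k:\,j-k>\varepsilon_n K} \tilde w_{j,k}^2 = O\Bigl(\sum_{k:\,j-k>\varepsilon_n K} \frac{1}{(j-k)^2(\log n)^2}\Bigr) = O\Bigl(\frac{(n\log n)^{-1/3}}{\varepsilon_n(\log n)^2}\Bigr),$$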

as n → ∞. So we find

$$\sum_k V_{n,k} \xrightarrow{p} \sigma^2, \qquad (10.17)$$

since

$$\sum_{m:\, m<\varepsilon_n K} \frac{1}{m+1} \sim \frac{1}{3}\log n, \quad n\to\infty.$$
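
The logarithmic growth of this harmonic-type sum can be checked numerically. The following sketch is only an illustration under stated assumptions: it takes $K = (n\log n)^{1/3}/c$ and $\varepsilon_n = (\log n)^{-1/3}$ (so that $\varepsilon_n K = n^{1/3}/c$), and the function name and the default c = 1 are hypothetical choices.

```python
import math


def harmonic_ratio(n, c=1.0):
    """Ratio of sum_{m < eps_n K} 1/(m+1) to log n.

    Assumptions made for this sketch: K = (n log n)^(1/3) / c and
    eps_n = (log n)^(-1/3), so eps_n * K = n^(1/3) / c.
    """
    K = (n * math.log(n)) ** (1.0 / 3.0) / c
    eps_n = math.log(n) ** (-1.0 / 3.0)
    m_max = int(eps_n * K)                       # number of terms in the sum
    partial_sum = sum(1.0 / (m + 1) for m in range(m_max))
    return partial_sum / math.log(n)


for n in (1e6, 1e12, 1e18):
    print(f"n = {n:.0e}: ratio = {harmonic_ratio(n):.4f}")
```

Because the partial sum equals $\frac{1}{3}\log n$ plus a bounded term, the ratio approaches 1/3 only at rate $1/\log n$, which is why the convergence looks slow even for very large n.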

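To get asymptotic normality, it only remains to show that the Lindeberg-type
condition

$$\sum_k E\{\xi_{n,k}^2 1_{\{|\xi_{n,k}|>\varepsilon\}} \mid \mathcal F_{n,k-1}\} \xrightarrow{p} 0 \qquad (10.18)$$

holds for each ε > 0, since in that case both conditions of Theorem 1 of [12] are
satisfied. To this end we use the conditional Cauchy-Schwarz inequality

$$E\{\xi_{n,k}^2 1_{\{|\xi_{n,k}|>\varepsilon\}} \mid \mathcal F_{n,k-1}\} \le \sqrt{E\{\xi_{n,k}^4 \mid \mathcal F_{n,k-1}\}\, E\{1_{\{|\xi_{n,k}|>\varepsilon\}} \mid \mathcal F_{n,k-1}\}}. \qquad (10.19)$$

Note that:

$$E\{1_{\{|\xi_{n,k}|>\varepsilon\}} \mid \mathcal F_{n,k-1}\} \le \varepsilon^{-2} E\{\xi_{n,k}^2 \mid \mathcal F_{n,k-1}\} = \varepsilon^{-2} V_{n,k} = O_p(1/\log n), \qquad (10.20)$$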

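as n → ∞. Using again the conditional independence of the $X_i$, given the values
of the pairs $(T_i, U_i)$, and defining $p_0(t,u) = F_0(u) - F_0(t)$, $\bar p_0(t,u) = 1 - p_0(t,u)$,
we get, if k < j:

$$\begin{aligned}
E\{\xi_{n,k}^4 \mid \mathcal F_{n,k-1}\} \sim{} & \frac{\tilde w_{j,k}^4\, n(\log n)^4}{c^8 h(t_k,t_0)^4}\, E\Bigl\{\int_{t\in I_k,\,u\in I_j} p_0(t,u)\bar p_0(t,u)\{p_0(t,u)^3+\bar p_0(t,u)^3\}\, d\mathbb H_n(t,u) \Bigm| \mathcal F_{n,k-1}\Bigr\} \\
& + \frac{3\tilde w_{j,k}^4\, n^2(\log n)^4}{c^8 h(t_k,t_0)^4}\, E\Bigl\{\Bigl[\int_{t\in I_k,\,u\in I_j} p_0(t,u)\{1-p_0(t,u)\}\, d\mathbb H_n(t,u)\Bigr]^2 \Bigm| \mathcal F_{n,k-1}\Bigr\}. \qquad (10.21)
\end{aligned}$$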

The first conditional expectation on the right-hand side of (10.21) arises from
terms of the form

$$E\bigl\{\{\Delta_{i,2} - (F_0(U_i)-F_0(T_i))\}^4 \bigm| \mathcal F_{n,k-1}\bigr\},$$

where $T_i \in I_k$, $U_i \in I_j$, and the second one from terms of the form

$$E\bigl\{\{\Delta_{i,2} - (F_0(U_i)-F_0(T_i))\}^2\{\Delta_{i',2} - (F_0(U_{i'})-F_0(T_{i'}))\}^2 \bigm| \mathcal F_{n,k-1}\bigr\},$$

where $i \ne i'$ and $T_i, T_{i'} \in I_k$; $U_i, U_{i'} \in I_j$, where we added the diagonal terms
(where i = i') for simplicity of notation, since they give a negligible contribution.
The other conditional expectations of cross products are zero. If k > j we get an
entirely similar expansion, with the roles of t and u interchanged.
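
The two kinds of terms in this expansion correspond to the standard fourth-moment identity for a sum of independent centered indicators. As a hedged illustration (an exact check with rational arithmetic, with hypothetical function names and arbitrarily chosen success probabilities standing in for the values $p_0(T_i, U_i)$), the sketch below verifies that $E(D-p)^4 = p(1-p)\{p^3 + (1-p)^3\}$ for a Bernoulli(p) variable D, and that the fourth moment of the sum equals the sum of the individual fourth moments plus 3 times the sum over ordered pairs $i \ne i'$ of products of variances.

```python
from fractions import Fraction
from itertools import product


def central_moment(p, r):
    """E (D - p)^r for D ~ Bernoulli(p), computed exactly."""
    return p * (1 - p) ** r + (1 - p) * (-p) ** r


def fourth_moment_brute_force(ps):
    """E (sum_i (D_i - p_i))^4 by enumerating all outcomes of independent D_i."""
    total = Fraction(0)
    for outcome in product((0, 1), repeat=len(ps)):
        prob = Fraction(1)
        centered_sum = Fraction(0)
        for d, p in zip(outcome, ps):
            prob *= p if d else 1 - p
            centered_sum += d - p
        total += prob * centered_sum ** 4
    return total


def fourth_moment_formula(ps):
    """Sum of fourth moments plus 3 * sum over ordered pairs i != i' of variance products."""
    m2 = [central_moment(p, 2) for p in ps]
    m4 = [central_moment(p, 4) for p in ps]
    off_diagonal = sum(m2) ** 2 - sum(v * v for v in m2)
    return sum(m4) + 3 * off_diagonal


probs = [Fraction(1, 5), Fraction(1, 3), Fraction(3, 4), Fraction(1, 2)]
print(fourth_moment_brute_force(probs) == fourth_moment_formula(probs))
```

Adding the diagonal terms, as is done in (10.21) for notational simplicity, amounts to replacing the off-diagonal sum by the full square of the sum of variances.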

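The first term on the right-hand side of (10.21) gives a contribution of order
$O_p(1/\sqrt{\log n})$ in the summation of the terms

$$\sqrt{E\{\xi_{n,k}^4 \mid \mathcal F_{n,k-1}\}}$$

over k. The square root of the second term is of order $O_p(1/\{|j-k|\log n\})$,
if $|j - k| < \varepsilon_n K$, which leads to a contribution of order $O_p(1)$ in the above
summation. The part where $|j - k| > \varepsilon_n K$ is again negligible.
So we get, using (10.19) and (10.20),

$$\sum_{k=1}^K E\{\xi_{n,k}^2 1_{\{|\xi_{n,k}|>\varepsilon\}} \mid \mathcal F_{n,k-1}\} \le \sum_{k=1}^K \sqrt{E\{\xi_{n,k}^4 \mid \mathcal F_{n,k-1}\}\, E\{1_{\{|\xi_{n,k}|>\varepsilon\}} \mid \mathcal F_{n,k-1}\}} = O_p\bigl(1/\sqrt{\log n}\bigr).$$

□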
Proof of Theorem 3.1.

ad (i). Lemma 10.2 shows that the terms involving $N'_k/N_k$ and $M'_k/M_k$ only
give a contribution to the asymptotic bias of Birgé's statistic, but not to the
limit distribution of the centered part. The limit distribution of the centered
part therefore arises from the terms $W_{n,j,k}$, where

$$W_{n,j,k} = n\log n \int_{(t,u)\in I_k\times I_j} \{\delta_2 - \{F_0(u)-F_0(t)\}\}\, d\mathbb P_n(t,u,\delta),$$

which are the numerators of the fractions 

Qj,k 

_ nlogn/(, „)g,^,,^. {^2 - {Foiu) - Foit)}} dP.Jt.u,S) 

Now note that

$$(n\log n)^{2/3}\int_{(t,u)\in I_j\times I_k} d\mathbb H_n(t,u) = c^2 h(t_j,t_k)\{1+o_p(1)\},$$

where the $o_p(1)$-term is uniform in k by the results given above. Moreover, by
part (i) of Lemma 3.1,



~ ^ Wj,k ~ hOp(l)2_^ 

k^j ^^Htj,tk) C^h{tj,tk) f^_,j 



where

$$\bar h(t,u) = h(t,u),\ t < u, \qquad \bar h(t,u) = h(u,t),\ t > u.$$

The result now follows from Lemma 10.3.

ad (ii). We first prove (3.15). Since EFn{tj) is the expectation of 

E |F„(tj) I Nk,Qj,k, k > j; A/fc, Q^j, /c < j| , 
part (i) of Lemma 10.1 tells us that 

a;,^E |S [h{tj) I Nk,Q,,k. k > j; A4, Q^,,, k < - ^ 0, 

as n → ∞. This implies:

[e {P^it,) I Nk,Qj.k, k > j; Mk, Qk.j, k < - EFn{tj)} 0, 
as n → ∞. But since, by part (ii) of Lemma 10.1,

^E^Fnit,) I Nk,Qj,k, k > j; Mk,Qkj, fc < jj - Fo(io)} ^ hMto), 
as n → ∞, we must have:

0,7^ ^EFn{tj) - Fo(to)} ^cfoito), 71 ex.. 
This yields (3.15). 

To prove (3.16), we first note that, by part (i) of Lemma 10.1, the variance 
of the conditional expectation 

a-^E^^Fnitj)- Fo{to) I Nk,QjM,k>j; Mk,Qk,j, k < 

in the decomposition 

a-' [Kit,) - Foih)} 

= a;;' [Pnitj) - E {F„{t,) I Nk,Qj,k, k > j; M^, Qk.j, /c < j}} 

+ a-^E {F„(t,) - Fo(to) I Nk,Qj,k, k > j; A4, Qk.j, k < jj 

tends to zero. By Lemma 10.2 the sum of terms involving Nk and Mk also gives 
an asymptotically negligible contribution to a^^ ^Fnitj) — ^o(^o)|- 
So we only have to consider the contribution of terms of the form 



Qj,k 



, k> j, 



(10.22) 



and 



a. 



-'^j.k{Q'k.,-E{Q',.\Qk,j)} 



k < j. (10.23) 
The variance of (10.22) is given by 

n{\ogn)^wl^ /(t,u)6/,x/, {^o(") - Fo{t)} {1 - (Foiu) - Fo(t))} dM,,it,u) 



E- 



In the proof of Lemma 3.1, (uniform) exponential inequalities are derived for the
probabilities of events of the following type:

$$A_{j,k} \stackrel{\text{def}}{=} \Bigl\{\Bigl|(n\log n)^{2/3}\int_{(t,u)\in I_j\times I_k} d\mathbb H_n(t,u) - c^2 h(t_j,t_k)\Bigr| > \varepsilon c^2 h(t_j,t_k)\Bigr\},$$

yielding upper bounds, tending to zero faster than any power of n. So we get:

r n{logn)^wlJ^^^^^^j^^j^ {Fo{u) - i^oW}{l - {Fo{u) - Foit))} dH„(f,M) 



< 



< {Foit,,,) - ^ (»logn)V3{i + ,(i)} ^ ^^^^^^ ^ 



which tends to zero faster than any power of n, uniformly in k. Here we use the 
lower bound 1/K for Wjl^\Yj>o}- 
So we find: 



E 



nilognfwU^^^^^^j^^j^ {Foju) - Fgjt)} ({1 - (Foju) - Fpit))} dMn{t,u) 

+ o{llK) 

i{\ognfwl, /(t,„)e/,x7, {^oM - i^o(i)} {1 - {Fo{u) - Fo(i))} dH„(t,n) 



> E 



> E 



n{\ognfwl^ /(t,.)6/.x7, {^o(w) - Fo(t)} {1 - (J^oN - F^it))} dm^{t,u) 



{l + e)^c^h{tj,tk) 

+ o{l/K). 

This implies: 

nilogn)^wU^^^^^^^j^^j^ {Foiu) - Foit)} {1 - (Fo(w) - Foit))} dM„(t,u) 



E 



= E 



(nlogn)4/3|/^^_^^^^^^^^ dR^it,u)} 
iilogn)^wl, {Foju) - Foit)} {1 - jFoju) - Fojt))} dH„ft,u) 

c''hitj,tk) 



+ oil/K). 

Now let, for $t_k < 1 - \delta$, where $\delta > 0$, the event $B_k$ be defined by



Bk=Uil + k-j) 



(nlogn)^/^ / dGniu) - cgiitk) > ecgiitk)\ 



For $t_k \ge 1 - \delta$, we define the event $B_k$ by:

>ec|, 
Similarly to what is true for $A_{j,k}$, we have that $P(B_k)$ tends to zero faster than
any power of n, uniformly in k. This shows that we also can replace $w_{j,k}$ by $\tilde w_{j,k}$
in the asymptotic expression for the variance, using the fact that the terms for
$t_k \ge 1 - \delta$ will give a contribution of lower order in the summation. So we find:

n(logn)^- 

' ^ ^^^^ 

^ 9nQfa)^£;/(,^^^^g,^^,^. {fo(u) - Fo{t)}{l - [F„{u)-Fo{t)))dW^{t,u) 

^ ij^k c4 {a(to) + b{to)f {j -k + lYh{tj,tk) 

^ V- ^najtkf {Fofa) - fo(tj-)} {1 - (Fpfa) - fpfe))} (nlogn)-^/^ 

^ V- 9na(tj)Voft,-) fa - tj) (n\ogn)~^/^ 



^ 9nafa)2/ofa)(nlogn)-i _ 3a(io)Vofa) 



/£:j>fc 



c{afa) + b{ta)f ij-k + l)h{tj,tk) c{afa) + 6fa)}' /ifa, ife) ' 



Similarly we find that the summation for k < j gives a contribution which is
asymptotically equivalent to

$$\frac{3b(t_0)^2 f_0(t_0)}{c\{a(t_0)+b(t_0)\}^2 h(t_j,t_k)}.$$

This yields (3.16). □
Proof of Lemma 3.1. 

We first prove (3.9). By Bennett's inequality (see, e.g., [12], p. 192) we have, for 
e > 0, 



{\Nk/n-ENk/n\ > 
< 2exp 



P 

( 9 



where

$$\phi(x) = \frac{2\{(1+x)\log(1+x) - x\}}{x^2}.$$

This way of stating Bennett's inequality first appeared in [13]. The function $\phi$
satisfies $\lim_{x\to 0}\phi(x) = 1$ and

$$\phi(x) \ge \frac{1}{1+x/3}, \quad x > 0,$$

see [12], p. 192, p. 193.
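
The properties of $\phi$ quoted above are easy to check numerically. The sketch below is a hedged illustration, not the specific bound used in this proof: it implements $\phi$ and the generic textbook form of Bennett's inequality, $P\{|S| \ge \eta\} \le 2\exp\{-\tfrac{\eta^2}{2V}\phi(M\eta/V)\}$, for a sum S of independent mean-zero variables bounded in absolute value by M with $V \ge \operatorname{var}(S)$. The function names and the small-x Taylor branch are choices made for this example.

```python
import math


def bennett_phi(x):
    """phi(x) = 2{(1 + x) log(1 + x) - x} / x^2 for x > 0."""
    if x <= 0:
        raise ValueError("x must be positive")
    if x < 1e-4:
        # Taylor expansion 1 - x/3 + x^2/6 avoids cancellation near zero.
        return 1.0 - x / 3.0 + x * x / 6.0
    return 2.0 * ((1.0 + x) * math.log1p(x) - x) / (x * x)


def bennett_two_sided_bound(eta, variance, bound):
    """Generic Bennett bound 2 exp{-(eta^2 / (2V)) phi(M eta / V)} on P(|S| >= eta).

    eta: deviation, variance: V >= var(S), bound: M with |X_i| <= M.
    """
    exponent = (eta * eta) / (2.0 * variance) * bennett_phi(bound * eta / variance)
    return 2.0 * math.exp(-exponent)


for x in (0.01, 0.5, 1.0, 10.0):
    print(x, bennett_phi(x), 1.0 / (1.0 + x / 3.0))
```

Replacing $\phi(x)$ by its lower bound $1/(1+x/3)$ turns Bennett's inequality into a Bernstein-type inequality, which is often the more convenient form when only the order of the exponent matters.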

By the continuity of $g_1$ on [0, 1] there exists for each k a $\xi_k \in I_k$ such that

$$\int_{I_k} g_1(t)\, dt = g_1(\xi_k)\{t_{k+1} - t_k\}.$$

Hence we get, for each k,

= ^"''{'29.(&)(logn)"»*(;:fe)) 

Similarly we get, for each k and points $\eta_k \in I_k$,

P^^\Mk/n~EMk/n\ > < 2exp j 
Moreover, if j < k, 

P{\Qj,k/n-EQ^_k/n\ > 

I n.f'^ 

< 2 exp 



I 2(72 ) (log n) 1/3 252 fe) 



■^^1 2/i(t„t,)(logn)2/3 '^^^ 2h{t,,tk) A 



with a similar upper bound, if k < j.
Let ε > 0, let $\bar h$ be defined by

$$\bar h(t,u) = h(t,u),\ u > t, \qquad \bar h(t,u) = h(u,t),\ u < t, \qquad (10.25)$$

and similarly $\bar Q_{j,k}$ by

$$\bar Q_{j,k}(t,u) = Q_{j,k}(t,u),\ u > t,\ k > j, \qquad \bar Q_{j,k}(t,u) = Q_{k,j}(u,t),\ u < t,\ k < j. \qquad (10.26)$$

Moreover, let the set Aj ,: be defined by 

Aj,e = \ sup |Qj- Jn - EQ^ Jn\ < sup \Nk/n - ENk/n\ < ^, 

e 



k<j K 



sup|Affe/n-£;Mfc| < 

and let 



h,= ir,lh{t,,tk). (10.27) 
Then we have:



4/i^(logn)2/3 ^ \Ah 



(10.28) 



Furthermore, as n → ∞,

$$\sup_{k:\,k>j} |K E N_k/n - g_1(t_k)| = \sup_{k:\,k>j} \Bigl|K\int_{t\in I_k} g_1(t)\, dt - g_1(t_k)\Bigr| \to 0,$$

also on the last interval, since $g_1(t) \to 0$ on this interval.

$$\sup_{k:\,k<j} |K E M_k/n - g_2(t_k)| = \sup_{k:\,k<j} \Bigl|K\int_{t\in I_k} g_2(t)\, dt - g_2(t_k)\Bigr| \to 0,$$

also on the first interval, since $g_2(t) \to 0$ on this interval. We also have:



$$\bigl|K^2 E\bar Q_{j,k}/n - \bar h(t_j,t_k)\bigr| = \Bigl|K^2\int_{I_j\times I_k} \bar h(t,u)\, dt\, du - \bar h(t_j,t_k)\Bigr| \to 0,$$

uniformly for all $t_k$ not belonging to the first or last interval, which may not
have length 1/K (see the construction of the intervals of Birgé's statistic at the
beginning of section 3). But on these intervals we have

$$h(t,t_j) \wedge g_2(t) = g_2(t) \quad\text{and}\quad h(t_j,t) \wedge g_1(t) = g_1(t),$$

respectively. So we get:

$$\sup_{k:\,k>j} \bigl|(K E N_k/n) \wedge (K^2 E Q_{j,k}/n) - g_1(t_k) \wedge h(t_j,t_k)\bigr| \to 0, \qquad (10.29)$$

and

$$\sup_{k:\,k<j} \bigl|(K E M_k/n) \wedge (K^2 E Q_{k,j}/n) - g_2(t_k) \wedge h(t_k,t_j)\bigr| \to 0. \qquad (10.30)$$

Hence, we get from (10.28), (10.29) and (10.30), on the set $A_{j,\varepsilon}$,



i - k + 1 

k<] k>'j 

^{Mk/nA{KQkJn) 



fc-j + 1 



k<j 



j -k + l ■ " ^ k - j + 1 

■' k>j ■' 

^{EMk/n)A{KEQk^jln) 



^{Nk/n) A {KQj,k/n) 



k<] 



j-k + l 



k>j 



^{ENk/n) A {KEQ,^k/n) 
k-j + 1 
K 1^. J-k + l



\Jn(\. - e) J ^ ^Jgi{tk) A h{tk,tj) ^ ^ ^/gi{tk) A h{tj,tk) 



{k<j k>3 ■' ) 

and similarly 

^ ^ + e) J V.92fa) A/tfa,tj) ^ Vffifa) A fefe , tfc) 

Moreover, letting $\varepsilon_n = (\log n)^{-1/3}$, we get:

^ j-k + l ^ k- j + 1 

k<j ■' k>j 

j-k + l ^ k- j + 1 

j-k + l ^ k- j + 1 

k:fj~fk>e„ ■' k:tk-tj>e„ 

$= \tfrac{1}{3}\{a(t_0) + b(t_0)\}(\log n)\{1 + o(1)\}.$

Relation (3.9) now follows. 

To prove (3.10) we first note that 



E^kw,>o}nA^^^, =0(^{K + irn'/^ exp |- 



% (log n) 2/3 ^ \^hj 
where $h_j$ is defined by (10.27), since $W_j \ge 1/(K+1)$, if $W_j > 0$. Thus we find:



< 



1 J y^ ^EMk A {KEQ^) y^ V-EA^fc A {KEQ.^k) 
(l-e)W2|Z. + i ,_fc + i J 

+ f (X + l)'"ni/3 exp 



^ ^ m/2 

."(1-e) 



4/lj(l0g?l)2/3 ^ \^4/i^. 

{(a(to) + &(to)) logn}"" ,n^oo. 
and similarly

1 ( m 

implying 



m/2 



{{a{tQ) + b(to)) log n} "^,12^00, 



9K 

n 



-iji 



{(a(to) + fo(io)) logn} ™,n^oo, 



which proves (3.10). 

Finally we get for j > k: 

(1 + fc - j)E - Wj,k\ l{w,>o} 



E 



< E 



{W,>Q} 



3a{tk) 



{a{to)+Kto)}\ogn 



{W,>0} 



+ 3a(tk)E 



{W,>0} 



{a{to) + b{to)}logn 



Applying the Cauchy-Schwarz inequality on the first term on the right-hand 
side yields, if j < fc, 



E 



{W,>0} 



{2 \ 1/2 
El^^{KNk/n) A {K^Qj,k/n) ~ 3a(tfe)| I ^ El/Wj 

= o(l/logn), 

uniformly in k, using (3.10) and the exponential inequalities for 

p[\Nk/n-ENk/n\ > and P {|Q,- fc/n - SQ,- > 

derived above. Using (3.10) again, we get that the second term satisfies the
inequality 



< IE 



1/2 



{iy,>o} 



{a(^o) + &(^o)}fog?^ 



= o(l/logn). 
The case k < j is treated similarly.
We also have:

$$(1+k-j)\,E|w_{j,k} - \tilde w_{j,k}|\,1_{\{W_j=0\}} = (1+k-j)\,|\tilde w_{j,k}|\,P\{W_j = 0\} = o(1/\log n),$$

since, in fact, $P\{W_j = 0\}$ tends to zero exponentially fast in n. This proves
(3.8). □

We next discuss the proof of Theorem 4.1. Since the following lemmas have 
proofs analogous to the proofs of Lemma 10.1 and Lemma 10.2 in section 3 we 
omit their proofs. 

Lemma 10.4. Let the observation density h, the number of intervals K and 
the constant c be as in Theorem 3.1, and let tk = t^j!^^ be the left boundary point 
of a sub-interval of the partition. Moreover, let $F_0$ have a continuous derivative
on [0, 1], and let $G_{n,1}$ and $G_{n,2}$ be the empirical distribution functions of the $T_i$
and $U_i$, respectively. Then



(i) 



{ 



Fo{tk) > l{Arfc>0} 




(10.31) 



where 




(ii) 



{ 



K 



Fo{tk) > l{A/fc>0} 




(10.32) 



where 




(iii) Let $t_j = t_0$. Then, if k > j,




where 
If k < j we get

Q'j,k 



c^h{tk t,) + 5c{/o(^o) - /o(ifc)}| {1 + 0^,(1)} , (10.34) 
where 

Wn,,,k I {52- {Fo{u) - Fo(t)}} dP„(t,u,5). 

(iv) The $o_p(1)$ terms in (iii) tend to zero uniformly in k.

Lemma 10.5. Let the conditions of Theorem 4.1 be satisfied, and let $t_j = t_0$.
Then, as n → ∞,



Wj^kUn,k \ ^ Wj^kVn,k 



EWj^k'-^n,k 



k-k>j ~ '^n,l {tk) lA^j '^n,2 (tk+l) — G„_2 (^fc) 



Since the first moment of the asymptotic distribution follows in a similar
way as in section 3, using the representations of the components $N'_k/N_k$, etc. of
Lemma 10.4, the proof of Theorem 4.1 again boils down to proving the following
lemma.

Lemma 10.6. Let the conditions of Theorem 4.1 be satisfied, and let $t_j = t_0$.
Moreover, let $W_{n,j,k}$ be defined as in part (iii) of Lemma 10.4, and let $\tilde w_{j,k}$ be
defined as in Theorem 4.1. Then:

k:k>j ^ ' k:k<j ^ ' ^' 

where the right-hand side denotes a normal random variable, with expectation 0
and variance $\sigma^2$, defined by (4.7).

Proof. Since the proof follows the same lines as the proof of Theorem 3.1,
we only give the main steps. We define the martingale difference array in the
same way as in the proof of Theorem 3.1. Then, if k < j, we get the following
representation of the conditional variance

$$\begin{aligned}
V_{n,k} &= \frac{n\tilde w_{j,k}^2}{\{c^2 h(t_k,t_0)\}^2}\, E\Bigl\{\int_{t\in I_k,\,u\in I_j} \{F_0(u)-F_0(t)\}\{1-F_0(u)+F_0(t)\}\, d\mathbb H_n(t,u) \Bigm| \mathcal F_{n,k-1}\Bigr\} \\
&\sim \frac{n^{1/3}\tilde w_{j,k}^2 \{F_0(t_0)-F_0(t_k)\}\{1-F_0(t_0)+F_0(t_k)\}\{1+o_p(1)\}}{c^2 h(t_k,t_0)}.
\end{aligned}$$
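Similarly we get, if k > j,

$$\begin{aligned}
V_{n,k} &= \frac{n\tilde w_{j,k}^2}{\{c^2 h(t_0,t_k)\}^2}\, E\Bigl\{\int_{t\in I_j,\,u\in I_k} \{F_0(u)-F_0(t)\}\{1-F_0(u)+F_0(t)\}\, d\mathbb H_n(t,u) \Bigm| \mathcal F_{n,k-1}\Bigr\} \\
&\sim \frac{n^{1/3}\tilde w_{j,k}^2 \{F_0(t_k)-F_0(t_0)\}\{1-F_0(t_k)+F_0(t_0)\}\{1+o_p(1)\}}{c^2 h(t_0,t_k)}.
\end{aligned}$$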

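Hence, using (4.3) and a Riemann sum approximation, we obtain:

$$\sum_k V_{n,k} \xrightarrow{p} \sigma^2. \qquad (10.35)$$

It remains to show the Lindeberg-type condition

$$\sum_k E\{\xi_{n,k}^2 1_{\{|\xi_{n,k}|>\delta\}} \mid \mathcal F_{n,k-1}\} \xrightarrow{p} 0, \qquad (10.36)$$

for each δ > 0. We again use the conditional Cauchy-Schwarz inequality

$$E\{\xi_{n,k}^2 1_{\{|\xi_{n,k}|>\delta\}} \mid \mathcal F_{n,k-1}\} \le \sqrt{E\{\xi_{n,k}^4 \mid \mathcal F_{n,k-1}\}\, E\{1_{\{|\xi_{n,k}|>\delta\}} \mid \mathcal F_{n,k-1}\}}. \qquad (10.37)$$

We have:

$$E\{1_{\{|\xi_{n,k}|>\delta\}} \mid \mathcal F_{n,k-1}\} \le \delta^{-2} E\{\xi_{n,k}^2 \mid \mathcal F_{n,k-1}\} = O_p(1/K) = O_p(n^{-1/3}). \qquad (10.38)$$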

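Letting $p_0(t,u) = F_0(u) - F_0(t)$, $\bar p_0(t,u) = 1 - p_0(t,u)$, we get, if k < j:

$$\begin{aligned}
E\{\xi_{n,k}^4 \mid \mathcal F_{n,k-1}\} \sim{} & \frac{n\tilde w_{j,k}^4}{c^8 h(t_k,t_0)^4}\, E\Bigl\{\int_{t\in I_k,\,u\in I_j} p_0(t,u)\bar p_0(t,u)\{p_0(t,u)^3+\bar p_0(t,u)^3\}\, d\mathbb H_n(t,u) \Bigm| \mathcal F_{n,k-1}\Bigr\} \\
& + \frac{3n^2\tilde w_{j,k}^4}{c^8 h(t_k,t_0)^4}\, E\Bigl\{\Bigl[\int_{t\in I_k,\,u\in I_j} p_0(t,u)\{1-p_0(t,u)\}\, d\mathbb H_n(t,u)\Bigr]^2 \Bigm| \mathcal F_{n,k-1}\Bigr\}. \qquad (10.39)
\end{aligned}$$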



The first and second terms on the right-hand side are, respectively, of order 
Ov I ^TTTT — ^ VT] and 0„ 






So the second term is dominant, and hence: 

n2 



k<j 



nw 



ch{tk,toy 



■IE- 



Or. 



pa{t, u){l - pQ{t, u)} (Mn{t, u) 



1/2 



^ n,k- 



(to - tr 



dt = Op{l). (10.40) 



Similarly the sum of the terms for k > j is $O_p(1)$. The result now follows from
(10.37) and (10.38). □



The proof of Theorem 4.1 can now be finished by making the transition from 
the random weights to the deterministic weights, using Lemma 4.1 (see the proof 
of Theorem 3.1 at the end of section 3), and using the central limit result of 
Lemma 10.6. 